# VGA CONTROLLER I: GRAPHIC

#### 12.1 INTRODUCTION

VGA (video graphics array) is a video display standard introduced in the late 1980s in IBM PCs and is widely supported by PC graphics hardware and monitors. We discuss the design of a basic eight-color 640-by-480 resolution interface for CRT (cathode ray tube) monitors in this book. CRT synchronization and basic graphic processing are examined in this chapter, and text generation is discussed in Chapter 13.

## 12.1.1 Basic operation of a CRT

The conceptual sketch of a monochrome CRT monitor is shown in Figure 12.1. The electron gun (cathode) generates a focused electron beam, which traverses a vacuum tube and eventually hits the phosphorescent screen. Light is emitted at the instant that electrons hit a phosphor dot on the screen. The intensity of the electron beam and the brightness of the dot are determined by the voltage level of the external video input signal, labeled mono in Figure 12.1. The mono signal is an analog signal whose voltage level is between 0 and 0.7 V.

A vertical deflection coil and a horizontal deflection coil outside the tube produce magnetic fields to control how the electron beam travels and to determine where on the screen the electrons hit. In today's monitors, the electron beam traverses (i.e., scans) the screen systematically in a fixed pattern, from left to right and from top to bottom, as shown in Figure 12.2.



Figure 12.1 Conceptual diagram of a CRT monitor.



Figure 12.2 CRT scanning pattern.

| Red (R) | Green (G) | Blue (B) | Resulting color |
|---------|-----------|----------|-----------------|
| 0       | 0         | 0        | black           |
| 0       | 0         | 1        | blue            |
| 0       | 1         | 0        | green           |
| 0       | 1         | 1        | cyan            |
| 1       | 0         | 0        | red             |
| 1       | 0         | 1        | magenta         |
| 1       | 1         | 0        | yellow          |
| 1       | 1         | 1        | white           |

**Table 12.1** Three-bit VGA color combinations

The monitor's internal oscillators and amplifiers generate sawtooth waveforms to control the two deflection coils. For example, the electron beam moves from the left edge to the right edge as the voltage applied to the horizontal deflection coil gradually increases. After reaching the right edge, the beam returns rapidly to the left edge (i.e., retraces) when the voltage changes to 0. The relationship between the sawtooth waveform and the scan is shown in Figure 12.4. Two external synchronization signals, hsync and vsync, control generation of the sawtooth waveforms. These signals are digital signals. The relationship between the hsync signal and the horizontal sawtooth is also shown in Figure 12.4. Note that the "1" and "0" periods of the hsync signal correspond to the rising and falling ramps of the sawtooth waveform.

The basic operation of a color CRT is similar except that it has three electron beams, which are projected to the red, green, and blue phosphor dots on the screen. The three dots are combined to form a pixel. We can adjust the voltage levels of the three video input signals to obtain the desired pixel color.

## 12.1.2 VGA port of the S3 board

The VGA port has five active signals, including the horizontal and vertical synchronization signals, hsync and vsync, and three video signals for the red, green, and blue beams. It is physically connected to a 15-pin D-subminiature connector. A video signal is an analog signal and the video controller uses a digital-to-analog converter to convert the digital output to the desired analog level. If a video signal is represented by an N-bit word, it can be converted to  $2^N$  analog levels. The three video signals can generate  $2^{3N}$  different colors. This is also known as 3N-bit color since a color is defined by 3N bits. In the S3 board, 1-bit word is used for each video signal, and this leads to only eight (i.e.,  $2^3$ ) possible colors. The possible color combinations are shown in Table 12.1. If we use the same 1-bit signal to drive the video signals, they become either "000" or "111" and the monitor functions as a black-and-white monochrome monitor.

#### 12.1.3 Video controller

A video controller generates the synchronization signals and outputs data pixels serially. A simplified block diagram of a VGA controller is shown in Figure 12.3. It contains a synchronization circuit, labeled vga\_sync, and a pixel generation circuit.



Figure 12.3 Simplified block diagram of a VGA controller.

The vga\_sync circuit generates the timing and synchronization signals. The hsync and vsync signals are connected to the VGA port to control the horizontal and vertical scans of the monitor. The two signals are decoded from the internal counters, whose outputs are the pixel\_x and pixel\_y signals. The pixel\_x and pixel\_y signals indicate the relative positions of the scans and essentially specify the location of the current pixel. The vga\_sync circuit also generates the video\_on signal to indicate whether to enable or disable the display. The design of this circuit is discussed in Section 12.2.

The pixel generation circuit generates the three video signals, which are collectively referred to as the rgb signal. A color value is obtained according to the current coordinates of the pixel (the pixel\_x and pixel\_y signals) and the external control and data signals. This circuit is more involved and is discussed in the second half of this chapter and Chapter 13.

#### 12.2 VGA SYNCHRONIZATION

The video synchronization circuit generates the hsync signal, which specifies the required time to traverse (scan) a row, and the vsync signal, which specifies the required time to traverse (scan) the entire screen. Subsequent discussions are based on a 640-by-480 VGA screen with a 25-MHz *pixel rate*, which means that 25M pixels are processed in a second. Note that this resolution is also know as the *VGA mode*.

The screen of a CRT monitor usually includes a small black border, as shown at the top of Figure 12.4. The middle rectangle is the visible portion. Note that the coordinate of the vertical axis increases downward. The coordinates of the top-left and bottom-right corners are (0,0) and (639,479), respectively.

## 12.2.1 Horizontal synchronization

A detailed timing diagram of one horizontal scan is shown in Figure 12.4. A period of the hsync signal contains 800 pixels and can be divided into four regions:



Figure 12.4 Timing diagram of a horizontal scan.



Figure 12.5 Timing diagram of a vertical scan.

- Display: region where the pixels are actually displayed on the screen. The length of this region is 640 pixels.
- Retrace: region in which the electron beams return to the left edge. The video signal should be disabled (i.e., black), and the length of this region is 96 pixels.
- Right border: region that forms the right border of the display region. It is also know as the *front porch* (i.e., porch before retrace). The video signal should be disabled, and the length of this region is 16 pixels.
- Left border: region that forms the left border of the display region. It is also know as the back porch (i.e., porch after retrace). The video signal should be disabled, and the length of this region is 48 pixels.

Note that the lengths of the right and left borders may vary for different brands of monitors. The hsync signal can be obtained by a special mod-800 counter and a decoding circuit. The counts are marked on the top of the hsync signal in Figure 12.4. We intentionally start the counting from the beginning of the display region. This allows us to use the counter output as the horizontal (x-axis) coordinate. This output constitutes the pixel\_x signal. The hsync signal goes low when the counter's output is between 656 and 751.

Note that the CRT monitor should be black in the right and left borders and during retrace. We use the h\_video\_on signal to indicate whether the current horizontal coordinate is in the displayable region. It is asserted only when the pixel count is smaller than 640.

## 12.2.2 Vertical synchronization

During the vertical scan, the electron beams move gradually from top to bottom and then return to the top. This corresponds to the time required to refresh the entire screen. The format of the vsync signal is similar to that of the hsync signal, as shown in Figure 12.5. The time unit of the movement is represented in terms of horizontal scan lines. A period of the vsync signal is 525 lines and can be divided into four regions:

 Display: region where the horizontal lines are actually displayed on the screen. The length of this region is 480 lines.

- Retrace: region that the electron beams return to the top of the screen. The video signal should be disabled, and the length of this region is 2 lines.
- Bottom border: region that forms the bottom border of the display region. It is also know as the *front porch* (i.e., porch before retrace). The video signal should be disabled, and the length of this region is 10 lines.
- *Top border*: region that forms the top border of the display region. It is also know as the *back porch* (i.e., porch after retrace). The video signal should be disabled, and the length of this region is 33 lines.

As in the horizontal scan, the lengths of the top and bottom borders may vary for different brands of monitors.

The vsync signal can be obtained by a special mod-525 counter and a decoding circuit. Again, we intentionally start counting from the beginning of the display region. This allows us to use the counter output as the vertical (y-axis) coordinate. This output constitutes the pixel\_y signal. The vsync signal goes low when the line count is 490 or 491.

As in the horizontal scan, we use the v\_video\_on signal to indicate whether the current vertical coordinate is in the displayable region. It is asserted only when the line count is smaller than 480.

## 12.2.3 Timing calculation of VGA synchronization signals

As mentioned earlier, we assume that the pixel rate is 25 MHz. It is determined by three parameters:

• p: the number of pixels in a horizontal scan line. For 640-by-480 resolution, it is

$$p = 800 \frac{pixels}{line}$$

• *l*: the number of lines in a screen (i.e., a vertical scan). For 640-by-480 resolution, it is

$$l = 525 \frac{lines}{screen}$$

• s: the number of screens per second. For flickering-free operation, we can set it to

$$s = 60 \frac{screens}{second}$$

The s parameter specifies how fast the screen should be refreshed. For a human eye, the refresh rate must be at least 30 screens per second to make the motion appear to be continuous. To reduce flickering, the monitor usually has a much higher rate, such as the 60 screens per second specification above. The pixel rate can be calculated by the three parameters:

$$\text{pixel rate} = p*l*s \approx 25M \; \frac{pixels}{second}$$

The pixel rate for other resolutions and refresh rates can be calculated in a similar fashion. Clearly, the rate increases as the resolution and refresh rate grow.

## 12.2.4 HDL implementation

The function of the vga\_sync circuit is discussed in Section 12.1.3. If the frequency of the system clock is 25 MHz, the circuit can be implemented by two special counters: a

mod-800 counter to keep track of the horizontal scan and a mod-525 counter to keep track of the vertical scan.

Since our designs generally use the 50-MHz oscillator of the prototyping board, the system clock rate is twice the pixel rate. Instead of creating a separate 25-MHz clock domain and violating the synchronous design methodology, we can generate a 25-MHz enable tick to enable or pause the counting. The tick is also routed to the p\_tick port as an output signal to coordinate operation of the pixel generation circuit.

The HDL code is shown in Listing 12.1. It consists of a mod-2 counter to generate the 25-MHz enable tick and two counters for the horizontal and vertical scans. We use two status signals, h\_end and v\_end, to indicate completion of the horizontal and vertical scans. The values of various regions of the horizontal and vertical scans are defined as constants. They can be easily modified if a different resolution or refresh rate is used. To remove potential glitches, output buffers are inserted for the hsync and vsync signals. This leads to a one-clock-cycle delay. We should add a similar buffer for the rgb signal in the pixel generation circuit to compensate for the delay.

Listing 12.1 VGA synchronization circuit

```
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity vga_sync is
   port (
      clk, reset: in std_logic;
      hsync, vsync: out std_logic;
      video_on, p_tick: out std_logic;
      pixel_x, pixel_y: out std_logic_vector (9 downto 0)
    );
end vga_sync;
architecture arch of vga_sync is
  -- VGA 640-by-480 sync parameters
   constant HD: integer:=640; --horizontal display area
   constant HF: integer:=16; --h. front porch
   constant HB: integer:=48; --h. back porch
   constant HR: integer:=96; --h. retrace
   constant VD: integer:=480; --vertical display area
   constant VF: integer:=10; --v. front porch
   {f constant} VB: integer:=33; --v. back\ porch
   constant VR: integer:=2; --v. retrace
  -- mod-2 counter
   signal mod2_reg, mod2_next: std_logic;
  -- sync counters
   signal v_count_reg, v_count_next: unsigned(9 downto 0);
   signal h_count_reg, h_count_next: unsigned(9 downto 0);
   -- output buffer
   signal v_sync_reg, h_sync_reg: std_logic;
   signal v_sync_next, h_sync_next: std_logic;
   -- status signal
   signal h_end, v_end, pixel_tick: std_logic;
begin
   -- registers
   process (clk, reset)
```

```
begin
        if reset='1' then
           mod2_reg <= '0';
           v_count_reg <= (others=>'0');
           h_count_reg <= (others=>'0');
40
           v_sync_reg <= '0';
           h_sync_reg <= '0';
        elsif (clk'event and clk='1') then
           mod2_reg <= mod2_next;</pre>
           v_count_reg <= v_count_next;</pre>
45
           h_count_reg <= h_count_next;
           v_sync_reg <= v_sync_next;</pre>
           h_sync_reg <= h_sync_next;
        end if;
    end process;
50
    -- mod-2 circuit to generate 25 MHz enable tick
     mod2_next <= not mod2_reg;</pre>
    -- 25 MHz pixel tick
     pixel_tick <= '1' when mod2_reg='1' else '0';</pre>
    -- status
     h_end <= -- end of horizontal counter
        '1' when h_count_reg=(HD+HF+HB+HR-1) else ---799
        0':
     v_end <= -- end of vertical counter
        '1' when v_count_reg=(VD+VF+VB+VR-1) else --524
        ,0,;
    -- mod-800 horizontal sync counter
     process (h_count_reg,h_end,pixel_tick)
     begin
        if pixel_tick='1' then -- 25 MHz tick
65
           if h_end='1' then
              h_count_next <= (others=>'0');
              h_count_next <= h_count_reg + 1;</pre>
70
           end if:
        else
           h_count_next <= h_count_reg;
        end if;
     end process;
     -- mod-525 vertical sync counter
     process (v_count_reg,h_end,v_end,pixel_tick)
     begin
        if pixel_tick='1' and h_end='1' then
           if (v_end='1') then
              v_count_next <= (others=>'0');
80
              v_count_next <= v_count_reg + 1;</pre>
           end if;
           v_count_next <= v_count_reg;</pre>
        end if;
     end process;
     -- horizontal and vertical sync, buffered to avoid glitch
```

```
h_sync_next <=
        '1' when (h_count_reg>=(HD+HF))
                                                      --656
             and (h_count_reg <= (HD+HF+HR-1)) else ---751
        ,0;
     v_sync_next <=
        '1' when (v_count_reg>=(VD+VF))
             and (v_count_reg <= (VD+VF+VR-1)) else --491
95
        0';
    -- video on/off
    video_on <=
        '1' when (h_count_reg<HD) and (v_count_reg<VD) else
        ,0,;
    -- output signal
    hsync <= h_sync_reg;
    vsync <= v_sync_reg;</pre>
    pixel_x <= std_logic_vector(h_count_reg);</pre>
    pixel_y <= std_logic_vector(v_count_reg);</pre>
    p_tick <= pixel_tick;</pre>
 end arch;
```

## 12.2.5 Testing circuit

To verify operation of the synchronization circuit, we can connect the rgb signal to three switches. The entire visible region should be turned on with a single color. We can go through the eight possible combinations and check the colors defined in Table 12.1. The HDL code is shown in Listing 12.2. As mentioned in Section 12.2.4, an output buffer is added for the rgb signal.

Listing 12.2 VGA synchronization testing circuit

```
library ieee;
 use ieee.std_logic_1164.all;
 entity vga_test is
    port (
       clk, reset: in std_logic;
       sw: in std_logic_vector(2 downto 0);
       hsync, vsync: out std_logic;
       rgb: out std_logic_vector(2 downto 0)
    );
10 end vga_test;
 architecture arch of vga_test is
     signal rgb_reg: std_logic_vector(2 downto 0);
     signal video_on: std_logic;
15 begin
      - instantiate VGA sync circuit
    vga_sync_unit: entity work.vga_sync
        port map(clk=>clk, reset=>reset, hsync=>hsync,
                 vsync=>vsync, video_on=>video_on,
                 p_tick=>open , pixel_x=>open , pixel_y=>open);
20
    -- rgb buffer
     process (clk, reset)
     begin
```

## 12.3 OVERVIEW OF THE PIXEL GENERATION CIRCUIT

The pixel generation circuit generates the 3-bit rgb signal for the VGA port. The external control and data signals specify the content of the screen, and the pixel\_x and pixel\_y signals from the vga\_sync circuit provide the current coordinates of the pixel. For our discussion purposes, we divided this circuit into three broad categories:

- Bit-mapped scheme
- Tile-mapped scheme
- Object-mapped scheme

In a bit-mapped scheme, a video memory is used to store the data to be displayed on the screen. Each pixel of the screen is mapped directly to a memory word, and the pixel\_x and pixel\_y signals form the address. A graphics processing circuit continuously updates the screen and writes relevant data to the video memory. A retrieval circuit continuously reads the video memory and routes the data to the rgb signal. This is the scheme used in today's high-performance video controller. For 640-by-480 resolution, there are about 310k (i.e., 640\*480) pixels on a screen. This translates to 310k memory bits for a monochrome display and 930k memory bits (i.e., 3 bits per pixel) for a 3-bit color display. A bit-mapped example is discussed in Section 12.5.

To reduce the memory requirement, one alternative is to use a *tile-mapped scheme*. In this scheme, we group a collection of bits to form a *tile* and treat each tile as a display unit. For example, we can define an 8-by-8 square of pixels (i.e., 64 pixels) as a tile. The 640-by-480 pixel-oriented screen becomes an 80-by-60 tile-oriented screen. Only 4800 (i.e., 80\*60) words are needed for the *tile memory*. The number of bits in a word depends on the number of tile patterns. For example, if there are 32 tile patterns, each word should contain 5 bits, and the size of the tile memory is about 24k bits (i.e., 5\*4800). The tile-mapped scheme usually requires a ROM to store the tile patterns. We call it *pattern memory*. Assume that monochrome patterns are used in the previous example. Each 8-by-8 tile pattern requires 64 bits, and the entire 32 patterns need 2K (i.e., 8\*8\*32) bits. The overall memory requirement is about 26k bits, which is much smaller than the 310k bits of the bit-mapped scheme. The text display discussed in Chapter 13 is based on this scheme.

For some applications, the video display can be very simple and contains only a few objects. Instead of wasting memory to store a mostly blank screen, we can generate these objects using simple object generation circuits. We call this approach an *object-mapped scheme*. An object-mapped example is discussed in Section 12.4.

The three schemes can be mixed together to generate a full screen. For example, we can use a bit-mapped scheme to generate the background and use an object-mapped scheme to produce the main objects. We can also use a bit-mapped scheme for one portion of a screen and tile-mapped text for another part of the screen.



Figure 12.6 Conceptual diagram of object-mapped pixel generation.

#### 12.4 GRAPHIC GENERATION WITH AN OBJECT-MAPPED SCHEME

The conceptual diagram of an object-mapped pixel generation circuit that contains three objects is shown in Figure 12.6. The diagram consists of three object generation circuits and a special selecting and routing circuit, labeled rgb mux. An object generation circuit performs the following tasks:

- It keeps the coordinates of the current object and compares it with the current scan location provided by the pixel\_x and pixel\_y signals.
- If the current scan location falls within the region, it asserts the obj\_i\_on signal to indicate that the current scan location is within the region of the *i*th object and the object should be "turned on."
- It specifies the desired color in the obj\_i\_rgb signal.

The rgb mux circuit performs multiplexing according to an internal prioritizing scheme. It examines various obj\_i\_on signals and determines which obj\_i\_rgb signal is to be routed to the rgb output. The prioritizing scheme prioritizes the order of the displays when multiple obj\_i\_on signals are asserted at the same time. It corresponds to selecting an object for the foreground.

We use a simplified ping-pong-like game to illustrate the various graphic generation schemes. The design is constructed as follows:

- 1. Create a simple still screen with rectangular objects.
- 2. Add a round object.
- 3. Introduce animation.
- 4. Add text for scores and information.
- 5. Create a top-level control circuit.

The first three steps are discussed in this section, and the last two steps are discussed in Chapter 13.



Figure 12.7 Still screen of the pong game.

## 12.4.1 Rectangular objects

A rectangular object can be described by its boundary coordinates on the screen. The still screen of the game is shown in Figure 12.7. It has three objects: a wall, which is shown as a narrow stripe on the left; a paddle, which is shown as a short vertical bar on the right; and a square ball. The coordinates of the displayable area of the screen is also shown. Note that the y-axis increases downward.

Let us first examine generation of the wall stripe. For clarity, we define constants for the relevant boundaries and sizes in code. The code segment for the wall is

```
constant WALL_X_L: integer:=32;
constant WALL_X_R: integer:=35;
...
-- pixel within wall
wall_on <=
    '1' when (WALL_X_L<=pix_x) and (pix_x<=WALL_X_R) else
    '0';
-- wall rgb output
wall_rgb <= "001"; -- blue</pre>
```

The wall is a four-pixel-wide vertical stripe between columns 32 and 35, which as defined as WALL\_X\_L and WALL\_X\_R, representing the left and right x-coordinates of the wall, respectively. The object has two output signals, wall\_on and wall\_rgb. The wall\_on signal, which indicates that the wall object should be turned on, is asserted when the current horizontal scan is within its region. Since the stripe covers the entire vertical column, there is no need for the y-axis boundaries. The wall\_rgb signal indicates that the color of the wall is "001" (blue).

The code segment for the bar (paddle) is

```
-- bar left , right boundary
constant BAR_X_L: integer:=600;
constant BAR_X_R: integer:=603;
```

The code is similar to that of the wall segment except that it includes the y-axis boundaries. The desired vertical length of the bar is 72 pixels, which is defined by BAR\_Y\_SIZE. Since we wish to place the bar in the middle, the top boundary of the bar, which is BAR\_Y\_T, is one half of the maximal y-value (i.e., 480/2) minus one half of the bar length. The bottom boundary of the bar is the top boundary plus the bar length. Generation of the bar\_on signal is similar to that of the wall\_on signal except that the vertical scan must be within the bar's y-axis boundaries as well.

The code for the ball can be constructed in a similar fashion. The final code segment is the selection and multiplexing circuit, which examines the on signals of three objects and routes the corresponding rgb signal to output. The code is

```
process (video_on, wall_on, bar_on, sq_ball_on,
        wall_rgb,bar_rgb,ball_rgb)
begin
   if video_on='0' then
       graph_rgb \ll "000"; --blank
   else
      if wall_on='1' then
         graph_rgb <= wall_rgb;</pre>
      elsif bar_on='1' then
         graph_rgb <= bar_rgb;
      elsif sq_ball_on='1' then
         graph_rgb <= ball_rgb;
      else
         graph_rgb <= "110"; -- yellow background</pre>
      end if;
   end if;
end process;
```

The circuit first checks whether the video\_on is asserted, and if this is the case, examines the three on signals in turn. When an on signal is asserted, it indicates that the scan is within its region, and the corresponding rgb signal is passed to the output. If no signal is asserted, the scan is in the "background" and the output is assigned to be "110" (yellow).

The complete HDL code is shown in Listing 12.3.

Listing 12.3 Pixel-generation circuit for the pong game screen

```
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity pong_graph_st is
```

```
port (
       video_on: in std_logic;
       pixel_x,pixel_y: in std_logic_vector(9 downto 0);
       graph_rgb: out std_logic_vector(2 downto 0)
    ):
10 end pong_graph_st;
 architecture sq_ball_arch of pong_graph_st is
    --x, y coordinates (0,0) to (639,479)
    signal pix_x, pix_y: unsigned(9 downto 0);
    constant MAX_X: integer:=640;
    constant MAX_Y: integer:=480;
    -- vertical stripe as a wall
    -- wall left, right boundary
    constant WALL_X_L: integer:=32;
    constant WALL_X_R: integer:=35;
    -- right vertical bar
25
    -- bar left, right boundary
    constant BAR_X_L: integer:=600;
    constant BAR_X_R: integer:=603;
    -- bar top, bottom boundary
    constant BAR_Y_SIZE: integer:=72;
    constant BAR_Y_T: integer:=MAX_Y/2-BAR_Y_SIZE/2; --204
    constant BAR_Y_B: integer:=BAR_Y_T+BAR_Y_SIZE-1;
    -- square ball
35
    constant BALL_SIZE: integer:=8;
    -- ball left, right boundary
    constant BALL_X_L: integer:=580;
    constant BALL_X_R: integer:=BALL_X_L+BALL_SIZE-1;
    -- ball top, bottom boundary
    constant BALL_Y_T: integer:=238;
    constant BALL_Y_B: integer:=BALL_Y_T+BALL_SIZE-1;
    -- object output signals
    signal wall_on, bar_on, sq_ball_on: std_logic;
    signal wall_rgb, bar_rgb, ball_rgb:
            std_logic_vector(2 downto 0);
50 begin
    pix_x <= unsigned(pixel_x);</pre>
    pix_y <= unsigned(pixel_y);</pre>
    -- (wall) left vertical stripe
    -- pixel within wall
    wall_on <=
```

```
'1' when (WALL_X_L <= pix_x) and (pix_x <= WALL_X_R) else
        0';
    -- wall rgb output
    wall_rgb <= "001"; -- blue
    -- right vertical bar
    -- pixel within bar
     bar_on <=
        '1' when (BAR_X_L <= pix_x) and (pix_x <= BAR_X_R) and
                  (BAR_Y_T \le pix_y) and (pix_y \le BAR_Y_B) else
        ,0,;
    -- bar rgb output
70
     bar_rgb <= "010"; --green
    -- square ball
    -- pixel within squared ball
75
     sq_ball_on <=
        '1' when (BALL_X_L <= pix_x) and (pix_x <= BALL_X_R) and
                  (BALL_Y_T <= pix_y) and (pix_y <= BALL_Y_B) else
        ,0,:
     ball_rgb <= "100"; -- red
80
    -- rgb multiplexing circuit
     process(video_on, wall_on, bar_on, sq_ball_on,
             wall_rgb, bar_rgb, ball_rgb)
     begin
        if video_on='0' then
            graph_rgb <= "000"; --blank</pre>
        else
           if wall_on='1' then
90
              graph_rgb <= wall_rgb;</pre>
           elsif bar_on='1' then
              graph_rgb <= bar_rgb;
           elsif sq_ball_on='1' then
95
              graph_rgb <= ball_rgb;
           else
              graph_rgb <= "110"; -- yellow background</pre>
        end if;
     end process;
 end sq_ball_arch;
```

After deriving the pixel generation circuit, we can combine it with the VGA synchronization circuit to construct the complete video interface. The top-level HDL code is shown in Listing 12.4. Note that the graph\_rgb signal is routed to output through an output buffer. It is loaded when the pixel\_tick signal is asserted. This synchronizes the rgb output with the buffered hsync and vsync signals.

**Listing 12.4** Complete circuit for a still pong game screen

```
use ieee.std_logic_1164.all;
 entity pong_top_st is
    port (
       clk,reset: in std_logic;
       hsync, vsync: out std_logic;
       rgb: out std_logic_vector(2 downto 0)
    );
 end pong_top_st;
 architecture arch of pong_top_st is
    signal pixel_x, pixel_y: std_logic_vector (9 downto 0);
    signal video_on, pixel_tick: std_logic;
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
15 begin
    -- instantiate VGA sync
    vga_sync_unit: entity work.vga_sync
        port map(clk=>clk, reset=>reset,
                 video_on=>video_on, p_tick=>pixel_tick,
                 hsync=>hsync, vsync=>vsync,
20
                 pixel_x=>pixel_x, pixel_y=>pixel_y);
    -- instantiate graphic generator
    pong_grf_st_unit: entity work.pong_graph_st(sq_ball_arch)
        port map (video_on=>video_on,
                  pixel_x=>pixel_x, pixel_y=>pixel_y,
25
                  graph_rgb=>rgb_next);
    -- rgb buffer
    process (clk)
    begin
        if (clk'event and clk='1') then
30
           if (pixel_tick='1') then
              rgb_reg <= rgb_next;
           end if;
       end if;
    end process;
    rgb <= rgb_reg;
 end arch:
```

#### 12.4.2 Non-rectangular object

Direct checking of the boundaries of a non-rectangular object is very difficult. An alternative is to specify the object pattern in a bit map and generate the rgb and on signals according to the map. This can best be explained by an example. Assume that we want to have a round ball in the pong game screen. The bit map of a circle within an 8-by-8 pixel square is shown in Figure 12.8. The circle object can be generated as follows:

- Check whether the scan coordinates are within the 8-by-8 pixel square.
- If this is the case, obtain the corresponding pixel from the bit map.
- Use the retrieved bit to generate the rgb and on signals for the circle object.

To implement this scheme, we need to include a *pattern ROM* to store the bit map and an address mapping circuit to convert the scan coordinates to the ROM's row and column.

To accommodate the change, the ball portion from Listing 12.3 must be modified. First, we define a pattern ROM for the circle. It can be done by declaring a two-dimensional



Figure 12.8 Bit map of a circle.

constant, as in the ROM template of Listing 11.5. To facilitate future animation, we also use signals to replace constants for the square ball boundaries. The revised architecture declaration portion becomes

```
constant BALL_SIZE: integer:=8;
-- ball left, right boundary
signal ball_x_1, ball_x_r: unsigned(9 downto 0);
-- ball top, bottom boundary
signal ball_y_t, ball_y_b: unsigned(9 downto 0);
-- round ball image ROM
type rom_type is array (0 to 7) of std_logic_vector(0 to 7);
-- ROM definition
constant BALL_ROM: rom_type :=
   "00111100", --
  "01111110", --
                *****
  "11111111", -- ******
  "11111111", -- ******
  "11111111", — ******
  "11111111", -- ******
  "01111110", -- *****
  "00111100" ---
signal rom_addr, rom_col: unsigned(2 downto 0);
signal rom_data: std_logic_vector(7 downto 0);
signal rom_bit: std_logic;
-- new signal to indicate whether the scan coordinates
-- are within the round ball region
signal rd_ball_on: std_logic;
```

Second, we expand the ball generation segment to include the mapping of the circle bit map:

```
rom_data <= BALL_ROM(to_integer(rom_addr));
rom_bit <= rom_data(to_integer(rom_col));
rd_ball_on <=
    '1' when (sq_ball_on='1') and (rom_bit='1') else
    '0';
-- ball rgb output
ball_rgb <= "100"; -- red</pre>
```

The first statement checks whether the current scan coordinates are within the square ball region and asserts the sq\_ball\_on signal accordingly. This part is the same as Listing 12.3 except that signals are used for boundaries. The second part obtains the corresponding ROM bit according to the current scan coordinates. If the scan coordinates are within the square ball region, subtracting the three LSBs from the top boundary (i.e., ball\_y\_t) provides the corresponding ROM row (i.e., rom\_addr), and subtracting the three LSBs from the left boundary (i.e., ball\_x\_l) provides the corresponding ROM column (i.e., rom\_col). The bit can then be retrieved by two indexing operations. It is then combined with the sq\_ball\_on signal to generate the rd\_ball\_on signal. This design just assigns a monochrome color (i.e., "100" red) for the round ball region. We can duplicate the pattern ROM three times to store the rgb value for each pixel and generate a multiple-color ball.

Finally, we need to make a minor modification in the multiplexing circuit to substitute the sq\_ball\_on signal with the rd\_ball\_on signal:

```
process ...
    elsif rd_ball_on='1' then
        graph_rgb <= ball_rgb;
...
end process;</pre>
```

These modifications are incorporated into the animated graph in the next subsection.

## 12.4.3 Animated object

When an object changes its location gradually in each scan, it creates the illusion of motion and becomes *animated*. To achieve this, we can use registers to store the boundaries of an object and update its value in each scan. In the pong game, the paddle is controlled by two pushbuttons and can move up and down, and the ball can move and bounce in all directions. We illustrate how to create animation for these two objects in this subsection.

While the VGA controller is driven by a 25-MHz pixel rate, the screen of the VGA monitor is refreshed only 60 times per second. The boundary registers only need to be updated at this rate. We create a 60-Hz enable tick, refr\_tick, which is asserted one clock cycle every  $\frac{1}{60}$  second.

Let us first examine the design of the paddle. To accommodate the changing y-axis coordinates, we replace the constants with two signals, bar\_y\_t and bar\_y\_b, to represent the top and bottom boundaries, and create a register, bar\_y\_reg, to store the current y-axis location of the top boundary. If one of the pushbuttons is pressed, bar\_y\_reg either increases or decreases a fixed amount when the refr\_tick signal is asserted. The amount is defined by a constant, BAR\_V, which stands for the bar velocity. We assume that assertion of the btn(1) and btn(0) signals causes the paddle to move up and down, respectively, and that the paddle stops moving when it reaches the top or the bottom of the screen. The code segment for updating bar\_y\_reg is

end process;

```
-- new bar y-position
process(bar_y_reg, bar_y_b, bar_y_t, refr_tick, btn)
begin
    bar_y_next <= bar_y_reg; -- default, no move
    if refr_tick='1' then
        if btn(1)='1' and bar_y_b<(MAX_Y-1-BAR_V) then
        -- button 1 asserted and bar not reach bottom yet
            bar_y_next <= bar_y_reg + BAR_V; -- move down
        elsif btn(0)='1' and bar_y_t > BAR_V then
        -- button 0 asserted and bar not reach top yet
            bar_y_next <= bar_y_reg - BAR_V; -- move up
        end if;
end process;</pre>
```

The design of the ball is more involved. We have to replace the four boundary constants with four signals and create two registers, ball\_x\_reg and ball\_y\_reg, to store the current x- and y-axis coordinates of the left and top boundaries. The ball usually moves at a constant velocity (i.e., at a constant speed and in the same direction). It may change direction when hitting the wall, the paddle, or the bottom or top of the screen. We decompose the velocity into an x-component and a y-component, whose values can be either a positive constant value, BALL\_V\_P, or a negative constant value, BALL\_V\_N. The current values of the two components are stored in the x\_delta\_reg and y\_delta\_reg registers. The code segment for updating ball\_x\_reg and ball\_y\_reg is

```
-- new ball position
      ball_x_next <=
         ball_x_reg + x_delta_reg when refr_tick='1' else
         ball_x_reg ;
      ball_y_next <=
         ball_y_reg + y_delta_reg when refr_tick='1' else
         ball_y_reg ;
and the code segment for updating x_delta_reg and y_delta_reg is
      -- new ball velocity
      process(x_delta_reg,y_delta_reg,ball_y_t,ball_x_l,ball_x_r,
               ball_y_t,ball_y_b,bar_y_t,bar_y_b)
      begin
         x_delta_next <= x_delta_reg; --default , no change</pre>
         y_delta_next <= y_delta_reg; --default, no change</pre>
         if ball_y_t < 1 then -- reach top
             y_delta_next <= BALL_V_P; --down</pre>
         elsif ball_y_b > (MAX_Y-1) then -- reach bottom
            y_delta_next <= BALL_V_N; --up</pre>
         elsif ball_x_1 <= WALL_X_R then -- reach wall
             x_delta_next <= BALL_V_P; --bounce back (to right)</pre>
         elsif (BAR_X_L \le ball_x_r) and (ball_x_r \le BAR_X_R) then
            -- reach x-coordinate of bar
             if (bar_y_t <= ball_y_b) and (ball_y_t <= bar_y_b) then</pre>
               -- within y-range of bar, hit
                x_delta_next <= BALL_V_N; --bounce back (to left)</pre>
            end if;
         end if;
```

Note that if the paddle bar misses the ball, the ball continues moving to right and eventually wraps around.

The complete code is shown in Listing 12.5.

Listing 12.5 Pixel-generation circuit for the animated pong game

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity pong_graph_animate is
    port (
         clk, reset: std_logic;
         btn: std_logic_vector(1 downto 0);
         video_on: in std_logic;
         pixel_x,pixel_y: in std_logic_vector(9 downto 0);
         graph_rgb: out std_logic_vector(2 downto 0)
10
    );
 end pong_graph_animate;
 architecture arch of pong_graph_animate is
    signal refr_tick: std_logic;
    --x, y coordinates (0,0) to (639,479)
    signal pix_x, pix_y: unsigned(9 downto 0);
    constant MAX_X: integer:=640;
    constant MAX_Y: integer:=480;
    -- vertical stripe as a wall
    -- wall left, right boundary
    constant WALL_X_L: integer:=32;
    constant WALL_X_R: integer:=35;
25
    -- right paddle bar
    -- bar left, right boundary
    constant BAR_X_L: integer:=600;
    constant BAR_X_R: integer:=603;
    -- bar top, bottom boundary
    signal bar_y_t, bar_y_b: unsigned(9 downto 0);
    constant BAR_Y_SIZE: integer:=72;
    -- reg to track top boundary (x position is fixed)
35
    signal bar_y_reg, bar_y_next: unsigned(9 downto 0);
    -- bar moving velocity when a button is pressed
    constant BAR_V: integer:=4;
    -- square ball
40
    constant BALL_SIZE: integer:=8; -- 8
    - ball left, right boundary
    signal ball_x_l, ball_x_r: unsigned(9 downto 0);
    -- ball top, bottom boundary
    signal ball_y_t, ball_y_b: unsigned(9 downto 0);
    -- reg to track left, top boundary
    signal ball_x_reg, ball_x_next: unsigned(9 downto 0);
```

```
signal ball_y_reg, ball_y_next: unsigned(9 downto 0);
    -- reg to track ball speed
    signal x_delta_reg, x_delta_next: unsigned(9 downto 0);
    signal y_delta_reg , y_delta_next: unsigned(9 downto 0);
    -- ball velocity can be pos or neg
    constant BALL_V_P: unsigned(9 downto 0)
              :=to_unsigned(2,10);
55
    constant BALL_V_N: unsigned(9 downto 0)
              :=unsigned(to_signed(-2,10));
     -- round ball image ROM
     type rom_type is array (0 to 7)
          of std_logic_vector(0 to 7);
    -- ROM definition
     constant BALL_ROM: rom_type :=
65
        "00111100", ---
        "01111110", -- *****
        "11111111", --- ******
        "11111111", -- ******
        "11111111", -- ******
70
        "11111111", -- ******
        "01111110", --- *****
        "00111100" --
                         ****
     signal rom_addr, rom_col: unsigned(2 downto 0);
75
     signal rom_data: std_logic_vector(7 downto 0);
     signal rom_bit: std_logic;
     -- object output signals
     signal wall_on, bar_on, sq_ball_on, rd_ball_on: std_logic;
     signal wall_rgb, bar_rgb, ball_rgb:
            std_logic_vector(2 downto 0);
  begin
     -- registers
     process (clk, reset)
     begin
        if reset='1' then
           bar_y_reg <= (others=>'0');
           ball_x_reg <= (others=>'0');
90
           ball_y_reg <= (others=>'0');
           x_delta_reg <= ("000000100");</pre>
           y_delta_reg <= ("000000100");</pre>
        elsif (clk'event and clk='1') then
           bar_y_reg <= bar_y_next;
95
           ball_x_reg <= ball_x_next;
           ball_y_reg <= ball_y_next;
            x_delta_reg <= x_delta_next;</pre>
           y_delta_reg <= y_delta_next;</pre>
        end if;
100
     end process;
```

```
pix_x <= unsigned(pixel_x);</pre>
     pix_y <= unsigned(pixel_y);</pre>
     -- refr_tick: 1-clock tick asserted at start of v-sync
               i.e., when the screen is refreshed (60 Hz)
105
     refr_tick <= '1' when (pix_y=481) and (pix_x=0) else
                    ,0,:
     -- (wall) left vertical stripe
110
     -- pixel within wall
     wall_on <=
        '1' when (WALL_X_L <= pix_x) and (pix_x <= WALL_X_R) else
         0':
     -- wall rgb output
     wall_rgb <= "001"; -- blue
     -- right vertical bar
     -- boundary
120
     bar_y_t <= bar_y_reg;</pre>
     bar_y_b <= bar_y_t + BAR_Y_SIZE - 1;</pre>
     -- pixel within bar
     bar_on <=
        '1' when (BAR_X_L <= pix_x) and (pix_x <= BAR_X_R) and
                  (bar_y_t <= pix_y) and (pix_y <= bar_y_b) else
         ,0,:
     -- bar rgb output
     bar_rgb <= "010"; --green
130
     -- new bar y-position
     process(bar_y_reg,bar_y_b,bar_y_t,refr_tick,btn)
     begin
        bar_y_next <= bar_y_reg; --- no move
        if refr_tick='1' then
            if btn(1)='1' and bar_y_b<(MAX_Y-1-BAR_V) then
135
               bar_y_next <= bar_y_reg + BAR_V; --- move down</pre>
            elsif btn(0)='1' and bar_y_t > BAR_V then
               bar_y_next <= bar_y_reg - BAR_V; -- move up</pre>
            end if;
        end if;
     end process;
     -- square ball
145
     -- boundary
     ball_x_1 <= ball_x_reg;
     ball_y_t <= ball_y_reg;</pre>
     ball_x_r <= ball_x_l + BALL_SIZE - 1;</pre>
150
     ball_y_b <= ball_y_t + BALL_SIZE - 1;</pre>
     -- pixel within ball
     sq_ball_on <=
        '1' when (ball_x_l <= pix_x) and (pix_x <= ball_x_r) and
                  (ball_y_t <= pix_y) and (pix_y <= ball_y_b) else
```

```
0':
155
     -- map current pixel location to ROM addr/col
     rom_addr <= pix_y(2 downto 0) - ball_y_t(2 downto 0);</pre>
     rom_col <= pix_x(2 downto 0) - ball_x_1(2 downto 0);</pre>
     rom_data <= BALL_ROM(to_integer(rom_addr));</pre>
     rom_bit <= rom_data(to_integer(rom_col));
160
     -- pixel within ball
     rd_ball_on <=
         '1' when (sq_ball_on='1') and (rom_bit='1') else
         0';
     -- ball rgb output
     ball_rgb <= "100";
     -- new ball position
     ball_x_next <= ball_x_reg + x_delta_reg
                         when refr_tick='1' else
                     ball_x_reg ;
170
     ball_y_next <= ball_y_reg + y_delta_reg
                         when refr_tick='1' else
                     ball_y_reg ;
     -- new ball velocity
     process(x_delta_reg,y_delta_reg,ball_y_t,ball_x_l,ball_x_r,
              ball_y_t,ball_y_b,bar_y_t,bar_y_b)
     begin
        x_delta_next <= x_delta_reg;</pre>
        y_delta_next <= y_delta_reg;</pre>
         if ball_y_t < 1 then -- reach top
180
            y_delta_next <= BALL_V_P;</pre>
         elsif ball_y_b > (MAX_Y-1) then
                                           -- reach bottom
            y_delta_next <= BALL_V_N;</pre>
         elsif ball_x_l <= WALL_X_R then -- reach wall</pre>
                                            -- bounce back
            x_delta_next <= BALL_V_P;</pre>
         elsif (BAR_X_L <= ball_x_r) and (ball_x_r <= BAR_X_R) then
            -- reach x of right bar
            if (bar_y_t \le ball_y_b) and (ball_y_t \le bar_y_b) then
               x_delta_next <= BALL_V_N; --hit, bounce back</pre>
            end if:
190
        end if;
     end process;
     -- rgb multiplexing circuit
195
     process(video_on, wall_on, bar_on, rd_ball_on,
              wall_rgb , bar_rgb , ball_rgb)
     begin
         if video_on='0' then
             graph_rgb <= "000"; --blank
200
         else
            if wall_on='1' then
               graph_rgb <= wall_rgb;
            elsif bar_on='1' then
               graph_rgb <= bar_rgb;
            elsif rd_ball_on='1' then
               graph_rgb <= ball_rgb;
```

As in the still screen, we can combine the synchronization circuit and create the top-level description. The HDL code is shown in Listing 12.6.

Listing 12.6 Complete circuit for the animated pong game screen

```
library ieee;
 use ieee.std_logic_1164.all;
 entity pong_top_an is
    port (
       clk,reset: in std_logic;
       btn: in std_logic_vector (1 downto 0);
       hsync, vsync: out std_logic;
       rgb: out std_logic_vector(2 downto 0)
    );
mend pong_top_an;
 architecture arch of pong_top_an is
    signal pixel_x, pixel_y: std_logic_vector (9 downto 0);
    signal video_on, pixel_tick: std_logic;
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
 begin
    -- instantiate VGA sync
    vga_sync_unit: entity work.vga_sync
       port map(clk=>clk, reset=>reset,
                 video_on=>video_on, p_tick=>pixel_tick,
20
                 hsync=>hsync, vsync=>vsync,
                 pixel_x=>pixel_x, pixel_y=>pixel_y);
    -- instantiate graphic generator
    pong_graph_an_unit: entity work.pong_graph_animate
       port map (clk=>clk, reset=>reset,
                  btn=>btn, video_on=>video_on,
                  pixel_x=>pixel_x, pixel_y=>pixel_y,
                  graph_rgb=>rgb_next);
    -- rgb buffer
    process (clk)
    begin
        if (clk'event and clk='1') then
           if (pixel_tick='1') then
              rgb_reg <= rgb_next;
           end if:
35
       end if;
    end process;
    rgb <= rgb_reg;
 end arch;
```

Note that there is no other control mechanism is this code. The ball simply moves and bounces continuously. A top-level control circuit is discussed in Chapter 13.



Figure 12.9 Dot trace shown in a 128-by-128 bit map.

#### 12.5 GRAPHIC GENERATION WITH A BIT-MAPPED SCHEME

The bit-mapped scheme maps each pixel to a word in video memory. There are about 310k pixels in a 640-by-480 screen. This translates to 310k and 930k bits for monochrome and color displays, respectively. The actual size of the video memory can be much larger since the memory address must be properly aligned for fast access. For example, to map the pixel's current coordinates to a memory location, we can concatenate the pixel's x-coordinate, which is 10 bits (i.e.,  $\lceil \log_2(640) \rceil$ ), and the pixel's y-coordinate, which is 9 bits (i.e.,  $\lceil \log_2(480) \rceil$ ). This approach requires no additional circuit to translating the pixel's coordinates to a memory address but introduces some unused "holes" in memory. The memory size is increased from 310k words to 512K (i.e.,  $2^{10+9}$ ) words.

For the S3 board, memory is available from the external SRAM chips and FPGA's embedded block RAMs, as discussed in Chapters 10 and 11. Recall that the total capacity of the Spartan 3S200 device's block RAM is only about 192K bits. It is not large enough for a full-screen bit-mapped display. We must use the external SRAM, which is 8M bits, for this purpose.

In this section, we use a small 128-by-128 ( $2^7$ -by- $2^7$ ) area of the screen to illustrate the design of the bit-mapped scheme. The screen has 16K ( $2^{14}$ ) pixels in this area and requires a 16K-by-3 video memory for color display. This can be implemented by three embedded block RAMs. The small area is at the top-left corner of the screen and displays the trace of a bouncing one-pixel dot, as shown in Figure 12.9. The circuit uses a 3-bit switch to specify the color of the trace and a pushbutton switch to randomly select the origin of the trace. When the pushbutton switch is pressed, the dot starts to move, like the bouncing ball in Section 12.4.3. The trace forms a rectangle after the dot hits the four sides of the small area. A new trace is generated each time the pushbutton switch is pressed.

#### 12.5.1 Dual-port RAM implementation

A conceptual block diagram of this circuit is shown in Figure 12.10. The video memory is a



Figure 12.10 Conceptual block diagram of a dot trace circuit.

synchronous 16K-by-3 (i.e.,  $2^{14}$ -by-3) dual-port RAM. The dual-port module discussed in Listing 11.4 can be used for this purpose. The seven LSBs of the pixel's y-coordinate form the seven MSBs of the memory address, and the seven LSBs of the pixel's x-coordinate form the seven LSBs of the memory address. The dot\_xy circuit keeps track of the current location of the dot and generates its current y- and x-coordinates, which are concatenated as the write address. The 3-bit external switch input, sw, is the rgb value, which is connected to the memory's din\_a port. The seven LSBs of pixel\_y and the seven LSBs of pixel\_x form the read address. The data is retrieved continuously and the corresponding readout is routed to the rgb multiplexing circuit.

The complete code of the dot trace pixel generation circuit is shown in Listing 12.7. We use two registers, dot\_x\_reg and dot\_y\_reg, to keep track of the dot's current x- and y-coordinates and use two registers, v\_x\_reg and v\_y\_reg, to keep track of the current horizontal and vertical velocities. Computation of the dot's coordinates and velocities is similar to that of the bouncing ball in Section 12.4.3. In addition to regular updates, the dot\_x\_next and dot\_y\_next signals obtain the values of the seven LSBs of pix\_x and pix\_y when the pushbutton switch is pressed. Since these signals change much faster than a human's perception, the new origin appears to be random.

Listing 12.7 Pixel-generation circuit for a 128-by-128 bit map

```
is architecture dual_port_ram_arch of bitmap_gen is
    signal pix_x, pix_y: unsigned(9 downto 0);
    signal refr_tick: std_logic;
    signal load_tick: std_logic;
    -- video sram
20
    signal we: std_logic;
    signal addr_r, addr_w: std_logic_vector(13 downto 0);
    signal din, dout: std_logic_vector(2 downto 0);
25
    -- dot location and velocity
    constant MAX_X: integer:=128;
    constant MAX_Y: integer:=128;
    -- dot velocity can be pos or neg
    constant DOT_V_P: unsigned(6 downto 0)
              :=to_unsigned(1,7);
    constant DOT_V_N: unsigned(6 downto 0)
              :=unsigned(to_signed(-1,7));
    -- reg to keep track of dot location
    signal dot_x_reg, dot_x_next: unsigned(6 downto 0);
    signal dot_y_reg, dot_y_next: unsigned(6 downto 0);
    -- reg to keep track of dot velocity
    signal v_x_reg, v_x_next: unsigned(6 downto 0);
    signal v_y_reg, v_y_next: unsigned(6 downto 0);
    -- object output signals
     signal bitmap_on: std_logic;
    signal bitmap_rgb: std_logic_vector(2 downto 0);
 begin
     — instantiate debounce circuit for a button
     debounce_unit: entity work.debounce
        port map(clk=>clk, reset=>reset, sw=>btn(0),
                 db_level=>open, db_tick=>load_tick);
50
    -- instantiate dual-port video RAM (2^12 - by - 7)
    video_ram: entity work.xilinx_dual_port_ram_sync
        generic map(ADDR_WIDTH=>14, DATA_WIDTH=>3)
        port map(clk=>clk, we=>we,
                 addr_a=>addr_w, addr_b=>addr_r,
55
                 din_a=>din, dout_a=>open, dout_b=>dout);
    -- video ram interface
    addr_w <= std_logic_vector(dot_y_reg & dot_x_reg);</pre>
     addr_r <=
        std_logic_vector(pix_y(6 downto 0) & pix_x(6 downto 0));
60
    we <= '1';
    din <= sw;
    bitmap_rgb <= dout;
    -- registers
65
     process (clk,reset)
     begin
        if reset='1' then
```

```
dot_x_reg <= (others=>'0');
            dot_y_reg <= (others=>'0');
70
            v_x_reg <= DOT_V_P;</pre>
            v_y_reg <= DOT_V_P;</pre>
         elsif (clk'event and clk='1') then
            dot_x_reg <= dot_x_next;</pre>
            dot_y_reg <= dot_y_next;</pre>
75
            v_x_reg <= v_x_next;</pre>
            v_y_reg <= v_y_next;</pre>
        end if;
     end process;
     -- misc. signals
     pix_x <= unsigned(pixel_x);
80
     pix_y <= unsigned(pixel_y);</pre>
     refr_tick <= '1' when (pix_y=481) and (pix_x=0) else
                    ,0,;
     -- pixel within bit map area
     bitmap_on <=
85
         '1' when (pix_x <= 127) and (pix_y <= 127) else
         0':
     -- dot position
     -- "randomly" load dot location when btn(0) pressed
     dot_x_next <=
         pix_x(6 downto 0) when load_tick='1' else
          dot_x_reg + v_x_reg when refr_tick='1' else
         dot_x_reg ;
     dot_y_next <=
         pix_y(6 downto 0) when load_tick='1' else
95
         dot_y_reg + v_y_reg when refr_tick='1' else
     dot_y_reg ;
     -- dot x velocity
     process(v_x_reg,dot_x_reg)
     begin
        v_x_next <= v_x_reg;</pre>
         if dot_x_reg =1 then
                                          -- reach left
            v_x_next <= DOT_V_P;</pre>
                                          -- bounce back
         elsif dot_x_reg=(MAX_X-2) then -- reach right
            v_x_next <= DOT_V_N;</pre>
                                          -- bounce back
105
        end if;
     end process;
     -- dot y velocity
     process(v_y_reg,dot_y_reg)
     begin
110
        v_y_next <= v_y_reg;</pre>
        if dot_y_reg =1 then
                                            -- reach top
            v_y_next <= DOT_V_P;</pre>
         elsif dot_y_reg = (MAX_Y-2) then -- reach bottom
            v_y_next <= DOT_V_N;</pre>
        end if;
     end process;
     -- rgb multiplexing circuit
     process(video_on,bitmap_on,bitmap_rgb)
     begin
120
```

The HDL code for the top-level system is shown in Listing 12.8.

Listing 12.8 Complete circuit for a bit-mapped screen

```
library ieee;
 use ieee.std_logic_1164.all;
 entity dot_top is
    port (
       clk, reset: in std_logic;
       btn: in std_logic_vector (1 downto 0);
       sw: in std_logic_vector (2 downto 0);
       hsync, vsync: out std_logic;
       rgb: out std_logic_vector(2 downto 0)
    );
 end dot_top;
 architecture arch of dot_top is
    signal pixel_x , pixel_y: std_logic_vector(9 downto 0);
    signal video_on, pixel_tick: std_logic;
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
 begin
     — instantiate VGA sync circuit
    vga_sync_unit: entity work.vga_sync
       port map(clk=>clk, reset=>reset,
20
                 hsync=>hsync, vsync=>vsync,
                 video_on=>video_on, p_tick=>pixel_tick,
                 pixel_x=>pixel_x, pixel_y=>pixel_y);
    -- instantiate bit-mapped pixel generator
    bitmap_unit: entity work.bitmap_gen
       port map(clk=>clk, reset=>reset, btn=>btn, sw=>sw,
                 video_on=>video_on, pixel_x=>pixel_x,
                 pixel_y=>pixel_y, bit_rgb=>rgb_next);
    -- rgb buffer
    process (clk)
    begin
        if (clk'event and clk='1') then
           if (pixel_tick='1') then
              rgb_reg <= rgb_next;
           end if;
35
       end if;
    end process;
    rgb <= rgb_reg;
```

## 12.5.2 Single-port RAM implementation

Although a dual-port memory is ideal, it is not always available. Using regular single-port memory, such as the S3 board's external SRAM, for the video memory requires careful coordination between the write and read operations to avoid interruption on data retrieval. For demonstration purposes, we configure the embedded block RAM as a single-port synchronous SRAM and redesign the previous dot trace circuit.

In the dot trace circuit, the dot's coordinates are updated once every screen scan. Thus, the video memory can be written at this rate as well. We can do this during the vertical retrace since the video is off in this period and writing video memory does not interfere with the screen data retrieval. Note that the refr\_tick signal is asserted when pixel\_y is 481. The video is off in this location, and writing video memory will not interfere with the screen data retrieval. We use this signal as the write enable signal, we, for the single-port RAM. The single-port RAM module discussed in Listing 11.2 can be used for this purpose. The memory portion of Listing 12.7 now becomes

The dot trace circuit updates one pixel in a screen scan. The required memory bandwidth for writing is 60\*3 bits per second, which is rather low. Thus, the previous design is fairly straightforward. The design of memory interface becomes much more difficult when a large memory bandwidth is required (i.e., when a large portion of the screen is updated at a rapid rate).

#### 12.6 BIBLIOGRAPHIC NOTES

Rapid Prototyping of Digital Systems by James O. Hamblen et al. contains timing information for monitors with different resolutions and refresh rates.

#### 12.7 SUGGESTED EXPERIMENTS

#### 12.7.1 VGA test pattern generator

A VGA test pattern generator produces two simple patterns to verify operation of a VGA monitor. The first pattern divides the screen evenly into eight vertical stripes, each displaying

a unique color. The second pattern is similar but the screen is divided into eight horizontal stripes. A 1-bit switch is used to select the pattern.

Design a pixel generating circuit for this pattern generator and then combine it with the synchronization circuit in a top-level module. Synthesize and verify operation of the circuit.

## 12.7.2 SVGA mode synchronization circuit

The specification for the super VGA (SVGA) mode with 72-Hz refresh rate is

• resolution: 800-by-600 pixels

• pixel rate: 50 MHz

horizontal display region: 800 pixels
horizontal right border: 64 pixels
horizontal left border: 56 pixels
horizontal retrace: 120 pixels
vertical display region: 600 lines
vertical bottom border: 23 lines
vertical top border: 37 lines
vertical retrace: 6 lines

We wish to create a dual-mode synchronization circuit that can support both VGA and SVGA modes. The mode can be selected by a switch. Construct the circuit as follows:

- Modify the horizontal and vertical synchronization counters of Listing 12.1 to accommodate both modes.
- 2. Design a pixel-generating circuit that draws a 100-pixel grid on the screen (i.e., draw a vertical line every 100 pixels and draw a horizontal line every 100 pixels).
- 3. Derive a top-level module. Synthesize and verify operation of the two modes.

## 12.7.3 Visible screen adjustment circuit

Due to the internal timing error of a monitor, the visible portion of the screen may not always be centered. We can adjust the location of the visible portion by slightly modifying the widths surrounding black border areas. In a horizontal scan line, there are 64 pixels for the right and left border regions. To move the visible portion horizontally, we can add a certain number of pixels to one border region and subtract the same number from the opposite border region. We can adjust the visible portion vertically in a similar fashion. Design a screen adjustment circuit as follows:

- 1. Expand the VGA synchronization circuit to include this feature. Use a switch to select the vertical or horizontal mode, and use two pushbuttons to move the visible screen to left/up and right/down.
- 2. Modify the testing circuit in Section 12.2.5 to incorporate the new synchronization circuit.
- 3. Synthesize and verify operation of the circuit.

#### 12.7.4 Ball-in-a-box circuit

The ball-in-a-box circuit displays a bouncing ball inside a square box. The square box is centered on the screen and its size is 256-by-256 pixels. The ball is an 8-by-8 round ball. When the ball hits the wall, the ball bounces back and the wall flashes (i.e., changes color briefly). The ball can travel at four different speeds, which are selected by two slide



Figure 12.11 Screen of the breakout game.

switches, and its direction changes randomly when a pushbutton switch is pressed. Derive the HDL code and then synthesize and verify operation of the circuit.

#### 12.7.5 Two-balls-in-a-box circuit

We can expand the circuit in Experiment 12.7.4 to include two balls inside the box. When two balls collide, the new directions of the two balls should follow the laws of physics. Derive the HDL code and then synthesize and verify operation of the circuit.

## 12.7.6 Two-player pong game

The two-player pong game replaces the left wall with another paddle, which is controlled by the second player. To better accommodate two players, we can use the keyboard interface of Section 8.4 as the input device. Four keys can be defined to control vertical movements of the two paddles. Derive the HDL code and then synthesize and verify operation of the circuit.

## 12.7.7 Breakout game

The breakout game is a somewhat like the pong game. In this game, the left wall is replaced by several layers of "bricks." When the ball hits a brick, the ball bounces back and the brick disappears. The basic screen is shown in Figure 12.11. As in the code of Listing 12.5, we assume that the game runs continuously. Derive the HDL code and then synthesize and verify operation of the circuit.

#### 12.7.8 Full-screen dot trace

We can implement the full-screen dot trace circuit of Section 12.5 using the external SRAM chip as follows:

1. Modify the SRAM controller in Chapter 10 to configure the SRAM chip as a 2<sup>19</sup>-by-8 memory.

- 2. Follow the discussion in Section 12.5.2 to incorporate the new memory module in the circuit. Note that accessing the external memory requires two clock cycles.
- 3. Synthesize and verify operation of the circuit.

## 12.7.9 Mouse pointer circuit

The mouse interface is discussed in Section 9.5. The mouse pointer circuit uses a mouse to control the movement of a small 16-by-16 square on the screen. It functions as follows:

- The square moves according to the movement of the mouse.
- The pointer wraps around when it reaches a border.
- The pointer changes color when the left button of the mouse is pressed. It circulates through the eight colors defined in Table 12.1.

Synthesize and verify operation of the circuit.

#### 12.7.10 Small-screen mouse scribble circuit

Mouse scribble circuit keeps track of the trace of the mouse movement in a 128-by-128 screen, somewhat similar to the dot trace circuit discussed in Section 12.5. Its specification is as follows:

- The 3-bit switch determines the color of the trace.
- Clicking the left button of the mouse turns on and off the trace alternately.
- Clicking the right button of the mouse clears the screen.

Synthesize and verify operation of the circuit.

#### 12.7.11 Full-screen mouse scribble circuit

Repeat Experiment 12.7.10, but use the full screen. An external SRAM module similar to that in Experiment 12.7.8 is needed for this circuit.

## VGA CONTROLLER II: TEXT

#### 13.1 INTRODUCTION

A tile-mapped pixel generation scheme is discussed in Section 12.3. A tile can be considered as a "super pixel." Whereas a pixel is defined by a 3-bit word in a bit-mapped scheme, a tile is mapped to a predesigned pattern. One method of constructing a text display is to treat the characters as tiles and design the pixel generation circuit with the tile-mapped scheme. We discuss this method in this chapter and apply it to add scores and rules to the pong game.

#### 13.2 TEXT GENERATION

## 13.2.1 Character as a tile

When applying a tile-mapped scheme, we treat each character as a tile. In a bit-mapped scheme, the value of a pixel represents a 3-bit color. On the other hand, the value of a tile represents the code of a specific pattern. For the text display, we use the 7-bit ASCII code for the character tiles.

The patterns of the tiles constitute the *font* of the character set. A variety of fonts are available. We choose an 8-by-16 (i.e., 8-column-by-16-row) font similar to the one used in early IBM PC. In this font, each character is represented as an 8-by-16 pixel pattern. The pattern for the letter "A" is shown in Figure 13.1(a).

The character patterns are stored in a ROM and each pattern requires  $2^4 * 8$  bits. The pattern memory is known as *font ROM*. The original font set consists of 256 patterns,



**Figure 13.1** Font pattern for the letter A.

including digits, upper- and lowercase letters, punctuation symbols, and many special-purpose graphic symbols. We implement only the first half [i.e.,  $128 (2^7)$ ] of the patterns and exclude most graphic symbols. To accommodate this set,  $2^7 * 2^4 * 8$  ROM bits are needed. It is usually configured as a  $2^{11}$ -by-8 ROM.

When we use these 8-by-16 characters (i.e., tiles) in a 640-by-480 resolution screen, 80 (i.e.,  $\frac{640}{8}$ ) tiles can be fitted into a horizontal line and 30 (i.e.,  $\frac{480}{16}$ ) tiles can be fitted into a vertical line. In other words, the screen can be treated as an 80-by-25 tile screen. We can put characters on the screen using these scaled coordinates.

## 13.2.2 Font ROM

Our font set implements the 128 characters of the ASCII code, listed in Table 7.1. The 128  $(2^7)$  character patterns can be accommodated by a  $2^{11}$ -by-8 font ROM. In this ROM, the seven MSBs of the 11-bit address are used to identify the character, and the four LSBs of the address are used to identify the row within a character pattern. The address and ROM content for the letter "A" are shown in Figure 13.1(b).

In the ASCII table, the first column (ASCII codes  $00_{16}$  to  $1F_{16}$ ) are nonprintable control characters. The font ROM uses these codes to implement special graphic symbols. For example, the  $06_{16}$  code will generate a spade pattern,  $\spadesuit$ , on the screen. Note that the  $00_{16}$  code is reserved for a blank tile.

The  $2^{11}$ -by-8 font ROM can fit neatly into a single block RAM of the Spartan-3 device. We use the ROM template of Listing 11.6 to ensure that a block RAM will be inferred during synthesis. Part of the HDL code is shown in Listing 13.1. The complete code has  $2^{11}$  rows in constant definition and the file can be downloaded from the companion Web site.

**Listing 13.1** Partial code of the font ROM

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity font_rom is
    port (
       clk: in std_logic;
       addr: in std_logic_vector(10 downto 0);
       data: out std_logic_vector(7 downto 0)
    );
10 end font_rom;
 architecture arch of font_rom is
    constant ADDR_WIDTH: integer:=11;
    constant DATA_WIDTH: integer:=8;
    signal addr_reg: std_logic_vector(ADDR_WIDTH-1 downto 0);
    type rom_type is array (0 to 2**ADDR_WIDTH-1)
          of std_logic_vector(DATA_WIDTH-1 downto 0);
    -- ROM definition
    constant ROM: rom_type:=( -2^11-by-8
    -- code x00 (blank space)
    "00000000", --- 0
    "00000000", -\!\!-\!\!1
    "00000000", — 2
    "00000000", --- 3
    "00000000", -- 4
    "00000000", --- 5
    "00000000", -- 6
    "00000000", — 7
    "00000000", — 8
    "00000000", --- 9
    "00000000", — a
    "00000000", -- b
    "00000000", —— c
    "00000000", — d
    "00000000", — e
    "00000000", — f
    - code x01 (smiley face)
    "00000000", --- 0
    "00000000", — \it 1
    "01111110", — 2
    "10000001", --- 3 *
    "10100101", --- 4 * *
    "10000001", — 5 *
    "10000001", — 6 *
    "10111101", — 7 * * * * * *
    "10011001", — 8 *
    "10000001", -- 9 *
    "10000001", -- a *
    "01111110", — b *****
    "00000000", -- c
50
    "00000000", -- d
    "00000000", — e
```



Figure 13.2 Two-stage text generation circuit.

Note that the block RAM-based ROM implementation introduces one-clock-cycle delay, as discussed in Section 11.4.3.

## 13.2.3 Basic text generation circuit

The pixel generation circuit generates the pixel values according to the current pixel coordinates (provided by the pixel\_x and pixel\_y signals) and the external data and control signals. Pixel generation based on a tile-mapped scheme involves two stages. The first stage uses the upper bits of the pixel\_x and pixel\_y signals to generate a tile's code, and the second stage uses this code and lower bits to generate the pixel's value.

The text generation circuit follows this method, and the basic diagram is shown in Figure 13.2. The screen is treated as a grid of 80-by-30 tiles, each containing an 8-by-16 font pattern. In the first stage, the pixel\_x(9 downto 3) and pixel\_y(8 downto 4) signals provides the x- and y-coordinates of the current tile location. The character generation circuit uses these coordinates, combined with other external data, to generate the value of this tile (labeled char\_addr), which corresponds to a character's ASCII code. In the second stage, the ASCII code becomes the seven MSBs of the address of the font ROM and specifies the location of the current pattern. It is concatenated with the four LSBs of the screen's y-coordinate [i.e., pixel\_y(3 downto 0), labeled row\_addr] to form the complete address (labeled rom\_addr) of the font ROM. The output of the font ROM (labeled font\_word) corresponds to an 8-bit row in the pattern. The three LSBs

of the screen's x-coordinate [i.e., pixel\_x(2 downto 0), labeled bit\_addr] specify the desired pixel location, and an 8-to-1 multiplexer routes the pixel to the output.

## 13.2.4 Font display circuit

We use a simple font display circuit to verify operation of the font ROM and display all font patterns on the screen. The 128 patterns are arranged in four rows, which correspond to the four columns of the ASCII table in Table 7.1. We can obtain each pattern by using the proper x- and y-coordinates to generate the desired ASCII code, which is labeled the char\_addr signal. The code segment is

```
char_addr <= pixel_y(5 downto 4) & pixel_x(7 downto 3);</pre>
```

The pixel\_x(7 downto 3) signal forms the five LSBs of the ASCII code, and thus 32 (2<sup>5</sup>) consecutive font patterns will be displayed in a row. The pixel\_y(5 downto 4) signal forms the two MSBs of the ASCII code, and thus four consecutive rows will be displayed. Since the upper bits of the pixel\_x and pixel\_y signals are left unspecified, the 32-by-4 region will be displayed repetitively on the screen. An additional code segment is included to turn on the display for the top-left portion of the screen only. The complete code is shown in Listing 13.2.

Listing 13.2 Pixel generation of a font display circuit

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity font_test_gen is
    port (
        clk: in std_logic;
        video_on: in std_logic;
        pixel_x, pixel_y: std_logic_vector(9 downto 0);
       rgb_text: out std_logic_vector(2 downto 0)
    ):
 end font_test_gen;
 architecture arch of font_test_gen is
    signal rom_addr: std_logic_vector(10 downto 0);
    signal char_addr: std_logic_vector(6 downto 0);
    signal row_addr: std_logic_vector(3 downto 0);
    signal bit_addr: std_logic_vector(2 downto 0);
    signal font_word: std_logic_vector(7 downto 0);
    signal font_bit, text_bit_on: std_logic;
20 begin
    -- instantiate font ROM
    font_unit: entity work.font_rom
        port map(clk=>clk, addr=>rom_addr, data=>font_word);
    -- font ROM interface
    char_addr <= pixel_y (5 downto 4) & pixel_x (7 downto 3);</pre>
    row_addr <= pixel_y(3 downto 0);
    rom_addr <= char_addr & row_addr;</pre>
    bit_addr <= pixel_x(2 downto 0);
    font_bit <= font_word(to_integer(unsigned(not bit_addr)));</pre>
    -- "on" region limited to top-left corner
    text_bit_on <=
```

```
font_bit when pixel_x(9 downto 8)="00" and
                      pixel_y(9 downto 6)="0000" else
        ,0,:
    -- rgb multiplexing circuit
35
    process(video_on,font_bit,text_bit_on)
    begin
        if video_on='0' then
            rgb_text <= "000"; --blank
        else
40
          if text_bit_on='1' then
              rgb_text <= "010"; --- green
              rgb_text <= "000"; -- black
45
           end if;
       end if;
    end process;
 end arch;
```

The key part of the code is the font ROM interface. For clarity, we define the following signals for the font ROM, as shown in Figure 13.2:

- char\_addr: 7 bits, the ASCII code of the character
- row\_addr: 4 bits, the row number in a particular font pattern
- rom\_addr: 11 bits, the address of the font ROM; the concatenation of char\_addr and row\_addr
- bit\_addr: 3 bits, the column number in a particular font pattern
- font\_word: 8 bits, a row of pixels of the font pattern specified by rom\_addr
- font\_bit: 1 bit, one pixel of font\_word specified by bit\_addr

The connection of these signals follows the diagram in Figure 13.2. The routing of the font\_bit signal is done by a multiplexer, coded as an array with dynamic index:

```
font_bit <= font_word(to_integer(unsigned(not bit_addr)));</pre>
```

Note that a row (i.e., a word) in the font ROM is defined with a descending order [i.e., (7 downto 0)]. Since the screen's x-coordinate is defined in an ascending fashion, in which the numbers increases from left to right, the order of the retrieved bits must be reversed. This is achieved by the **not** operator in the expression.

We need to combine the synchronization circuit and create the top-level description. The HDL code is shown in Listing 13.3.

Listing 13.3 Top-level description of a font display circuit

```
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity font_test_top is

port(
    clk, reset: in std_logic;
    hsync, vsync: out std_logic;
    rgb: out std_logic_vector(2 downto 0)
    );
end font_test_top;

architecture arch of font_test_top is
    signal pixel_x, pixel_y: std_logic_vector(9 downto 0);
```

```
signal video_on, pixel_tick: std_logic;
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
 begin
    -- instantiate VGA sync circuit
    vga_sync_unit: entity work.vga_sync
       port map(clk=>clk, reset=>reset, hsync=>hsync,
                 vsync=>vsync, video_on=>video_on,
20
                 pixel_x=>pixel_x, pixel_y=>pixel_y,
                 p_tick=>pixel_tick);
    -- instantiate font ROM
    font_gen_unit: entity work.font_test_gen
       port map(clk=>clk, video_on=>video_on,
25
                 pixel_x=>pixel_x, pixel_y=>pixel_y,
                 rgb_text=>rgb_next);
    -- rgb buffer
    process (clk)
    begin
30
        if (clk'event and clk='1') then
           if (pixel_tick='1') then
              rgb_reg <= rgb_next;
          end if;
       end if;
35
    end process;
    rgb <= rgb_reg;
 end arch;
```

There is subtle timing issue in this circuit. Because of the block RAM implementation, the font ROM's output suffers a one-clock-cycle delay. However, since the pixel\_tick signal is asserted every two clock cycles, the pixel\_x signal is remained unchanged within this interval and the corresponding bit (i.e., font\_bit) can be retrieved properly. The rgb multiplexing circuit can use this data, and the desired value is stored to the rgb\_reg register in a timely manner.

## 13.2.5 Font scaling

In the tile-mapped scheme, we can scale a tile pattern to larger sizes by "enlarging" the screen pixels. For example, we can scale the 8-by-16 font to the 16-by-32 font by enlarging the original pixel four times (i.e., expanding one pixel to four pixels). To perform the scaling, we just need to shift pixel coordinates to the right 1 bit and discard the LSBs of the pixel\_x and pixel\_y signals. This can best be explained by an example. Let us repeat the previous font displaying circuit with enlarged 16-by-32 fonts. The screen can now be treated as a grid of 40-by-15 tiles. The new font addresses become

```
row_addr <= pixel_y (4 downto 1);
bit_addr <= pixel_x (3 downto 1);
char_addr <= pixel_y (6 downto 5) & pixel_x (8 downto 4);</pre>
```

The first two statements imply that the same font\_bit value will be obtained when pixel\_x(0) and pixel\_y(0) are "00", "01", "10", and "11", and this effectively enlarges the original pixel to four pixels. The text\_bit\_on condition also needs to be modified to accommodate a larger region:



Figure 13.3 Text generation circuit with tile memory.

We can apply this scheme to scale up the font even further. Note that the enlarged fonts may appear jagged because they simply magnify the original pattern and introduce no new detail.

## 13.3 FULL-SCREEN TEXT DISPLAY

A full-screen text display, as the name indicates, uses the entire screen to display text characters. The character generation circuit now contains a *tile memory* that stores the ASCII code of each tile. The design of the tile memory is similar to the video memory of the bit-mapped circuit in Section 12.5. For easy memory access, we can concatenate the x-and y-coordinates of a tile to form the address. This translates to 12 bits for the 80-by-30 (i.e.,  $2^7$ -by- $2^5$ ) tile screen. Since each tile contains a 7-bit ASCII code, a  $2^{12}$ -by-7 memory module is required. A synchronous dual-port RAM can be used for this purpose. A circuit with tile memory is shown in Figure 13.3.

Because accessing tile memory requires another clock cycle, retrieving a font pattern is now increased to two clock cycles. This prolonged delay introduces a subtle timing problem. Because the pixel\_x signal is updated every two clock cycles, its value has incremented when the font\_word value becomes available. Thus, when the bit is retrieved by the statements

```
bit_addr <= pixel_x(2 downto 0);
font_bit <= font_word(to_integer(unsigned(not bit_addr)));</pre>
```

the incremented bit\_addr is used and an incorrect font bit will be selected and routed to the output. One way to overcome the problem is to pass the pixel\_x signal through two buffers and use this delayed signal in place of the pixel\_x signal.

We use a simple circuit to demonstrate the design of the full-screen tile-mapped scheme. The circuit reads an ASCII code from a 7-bit switch and places it in the marked location

of the 80-by-30 tile screen. The conceptual diagram is shown in Figure 13.3. A cursor is included to mark the current location of entry, where the color is reversed. The cursor block keeps track of the current location of the cursor. The circuit uses three pushbutton switches for control. Two buttons move the cursor right and down, respectively. The third button is for the write operation. When it is pressed, the current value of the 7-bit switch is written to the tile memory. The HDL code is shown in Listing 13.4.

Listing 13.4 Pixel generation of a full-screen text display

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity text_screen_gen is
    port (
       clk, reset: std_logic;
       btn: std_logic_vector(2 downto 0);
       sw: std_logic_vector(6 downto 0);
       video_on: in std_logic;
       pixel_x, pixel_y: in std_logic_vector(9 downto 0);
       text_rgb: out std_logic_vector(2 downto 0)
    );
 end text_screen_gen;
15 architecture arch of text_screen_gen is
    -- font ROM
    signal char_addr: std_logic_vector(6 downto 0);
    signal rom_addr: std_logic_vector(10 downto 0);
    signal row_addr: std_logic_vector(3 downto 0);
    signal bit_addr: unsigned(2 downto 0);
    signal font_word: std_logic_vector(7 downto 0);
    signal font_bit: std_logic;
    -- tile RAM
    signal we: std_logic;
    signal addr_r, addr_w: std_logic_vector(11 downto 0);
25
    signal din, dout: std_logic_vector(6 downto 0);
    --80-by-30 tile map
    constant MAX_X: integer:=80;
    constant MAX_Y: integer:=30;
    -- cursor
    signal cur_x_reg, cur_x_next: unsigned(6 downto 0);
    signal cur_y_reg, cur_y_next: unsigned(4 downto 0);
    signal move_x_tick, move_y_tick: std_logic;
    signal cursor_on: std_logic;
    -- delayed pixel count
    signal pix_x1_reg, pix_y1_reg: unsigned(9 downto 0);
    signal pix_x2_reg, pix_y2_reg: unsigned(9 downto 0);
    -- object output signals
    signal font_rgb, font_rev_rgb:
              std_logic_vector(2 downto 0);
 begin
     -- instantiate debounce circuit for two buttons
     debounce_unit0: entity work.debounce
        port map(clk=>clk, reset=>reset, sw=>btn(0),
                  db_level=>open , db_tick=>move_x_tick);
45
```

```
debounce_unit1: entity work.debounce
         port map(clk=>clk, reset=>reset, sw=>btn(1),
                   db_level=>open, db_tick=>move_y_tick);
    -- instantiate font ROM
    font_unit: entity work.font_rom
50
        port map(clk=>clk, addr=>rom_addr, data=>font_word);
    -- instantiate dual-port tile RAM (2^12 - by - 7)
    video_ram: entity work.xilinx_dual_port_ram_sync
        generic map(ADDR_WIDTH=>12, DATA_WIDTH=>7)
        port map(clk=>clk, we=>we,
55
                 addr_a=>addr_w, addr_b=>addr_r,
                 din_a=>din, dout_a=>open, dout_b=>dout);
    -- registers
     process (clk)
     begin
60
       if (clk'event and clk='1') then
           cur_x_reg <= cur_x_next;</pre>
           cur_y_reg <= cur_y_next;</pre>
           pix_x1_reg <= unsigned(pixel_x); --- 2-clock delay</pre>
           pix_x2_reg <= pix_x1_reg;</pre>
65
           pix_y1_reg <= unsigned(pixel_y);</pre>
           pix_y2_reg <= pix_y1_reg;</pre>
       end if;
     end process;
    -- tile RAM write
     addr_w <=std_logic_vector(cur_y_reg & cur_x_reg);</pre>
    we <= btn(2);
    din <= sw:
    -- tile RAM read
    -- use undelayed coordinates to form tile RAM address
    addr_r <=pixel_y(8 downto 4) & pixel_x(9 downto 3);</pre>
     char_addr <= dout;</pre>
    -- font ROM
    row_addr <= pixel_y (3 downto 0);
    rom_addr <= char_addr & row_addr;
80
    -- use delayed coordinate to select a bit
    bit_addr <= pix_x2_reg(2 downto 0);
    font_bit <= font_word(to_integer(not bit_addr));</pre>
    -- new cursor position
    cur_x_next <=
85
        (others=>'0') when move_x_tick='1' and -- wrap around
                            cur_x_reg=MAX_X-1 else
        cur_x_reg + 1
                       when move_x_tick='1' else
        cur_x_reg ;
    cur_y_next <=
        (others=>'0') when move_y_tick='1' and -- wrap around
                            cur_y_reg=MAX_Y-1 else
        cur_y_reg + 1 when move_y_tick='1' else
        cur_y_reg;
    -- object signals
    -- green over black and reversed video for curser
    font_rgb <= "010" when font_bit='1' else "000";</pre>
    font_rev_rgb <="000" when font_bit='1' else "010";</pre>
```

```
-- use delayed coordinates for comparison
     cursor_on <='1' when pix_y2_reg(8 downto 4)=cur_y_reg and
                            pix_x2_reg(9 downto 3)=cur_x_reg else
                  ,0,:
     -- rgb multiplexing circuit
     process(video_on,cursor_on,font_rgb,font_rev_rgb)
     begin
105
        if video_on='0' then
            text_rgb \ll "000"; --blank
        else
            if cursor_on='1' then
               text_rgb <= font_rev_rgb;</pre>
110
               text_rgb <= font_rgb;</pre>
            end if;
        end if;
     end process;
  end arch;
```

The font ROM interface signals are similar to those in Listing 13.2 except that the char\_addr is obtained from the read port of the tile memory. To facilitate the font ROM access delay, we creat two delayed signals, pix\_x2\_reg and pix\_y2\_reg, from the current x- and y-coordinates, pixel\_x and pixel\_y. Note that the undelayed signals, pixel\_x and pixel\_y, are used to form the address to access the font ROM, but the delayed signal, pix\_x2\_reg, is used to obtain the font bit. The instantiation and interface of the dual-port tile RAM is similar to those of the video RAM in Listing 12.7.

The cursor\_on signal is used to identify the current cursor location. The colors of the font pattern are reversed in this location. Because the font bits are delayed by two clocks, we use the delayed coordinates, pix\_x2\_reg and pix\_y2\_reg, for comparison.

The delayed font bits also introduce one pixel delay for the final rgb signal. This implies the overall visible portion of the VGA monitor is shifted to right by one pixel. To correct the problem, we should revise the vga\_sync circuit and use the delayed pix\_x2\_reg and pix\_y2\_reg signals to generate the hsync and vsync signals. Since the shift has little effect on the overall video quality, we do not make this modification.

The top-level code combines the text pixel generation circuit and the synchronization circuit and is shown in Listing 13.5.

Listing 13.5 Top-level system of a full-screen text display

```
signal video_on, pixel_tick: std_logic;
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
 begin
    -- instantiate VGA sync circuit
    vga_sync_unit: entity work.vga_sync
        port map(clk=>clk, reset=>reset,
20
                 hsync=>hsync, vsync=>vsync,
                 video_on=>video_on, p_tick=>pixel_tick,
                 pixel_x=>pixel_x, pixel_y=>pixel_y);
    -- instantiate full-screen text generator
    text_gen_unit: entity work.text_screen_gen
25
        port map(clk=>clk, reset=>reset, btn=>btn, sw=>sw,
                 video_on=>video_on, pixel_x=>pixel_x,
                 pixel_y=>pixel_y, text_rgb=>rgb_next);
    -- rgb buffer
    process (clk)
30
    begin
        if (clk'event and clk='1') then
           if (pixel_tick='1') then
              rgb_reg <= rgb_next;
           end if;
35
        end if;
    end process;
    rgb <= rgb_reg;
 end arch;
```

### 13.4 THE COMPLETE PONG GAME

We create a free-running graphic circuit for the pong game in Section 12.4.3. In this section, we add a text interface to display scores and messages, and design a top-level control FSM that integrates the graphic and text subsystems and coordinates the overall circuit operation. The rules and operations of the complete game are:

- When the game starts, it displays the text of the rule.
- After a player presses a button, the game starts.
- The player scores a point each time hitting the ball with the paddle.
- When the player misses the ball, the game pauses and a new ball is provided. Three balls are provided in each session.
- The score and the number of remaining balls are displayed on the top of the screen.
- After three misses, the game is ended and displays the end-of-game message.

In the following subsections, we first discuss the text subsystem, graphic subsystem, and auxiliary counters, and then derive a top-level FSM to coordinate and control the overall operation. The conceptual diagram is shown in Figure 13.4.

### 13.4.1 Text subsystem

The text subsystem of the pong game consists of four text messages:

• Display the score as "Scores: DD" and the number of remaining balls as "Ball: D" in 16-by-32 font on top of the screen.



Figure 13.4 Top-level block diagram of the complete pong game.

- Display the rule message "Rules: Use two buttons to move paddle up or down." in regular font at the beginning of the game.
- Display the "PONG" logo in 64-by-128 font on the background.
- Display the end-of-game message "Game Over" in 32-by-64 font at the end of the game.

A sketch of the first three messages is shown in Figure 13.5. The end-of-game message is overlapped with the rule message and not included.

Since these messages use different font sizes and are displayed at different occasions, they cannot be treated as a single screen. We treat each text message as an individual object and generate the on status signal and the font ROM address. For example, the logo message segment is

```
logo_on <=
    '1' when pix_y(9 downto 7)=2 and
        (3<= pix_x(9 downto 6) and pix_x(9 downto 6)<=6) else
    '0';
row_addr_1 <= std_logic_vector(pix_y(6 downto 3));
bit_addr_1 <= std_logic_vector(pix_x(5 downto 3));
with pix_x(8 downto 6) select
    char_addr_1 <=
        "1010000" when "011", -- P x50
        "1001111" when "100", -- O x4f
        "1001110" when "101", -- N x4e</pre>
```



Figure 13.5 Text of the pong game.

```
"1000111" when others; —G x47
```

The logo\_on signal indicates that the current scan is in the logo region and the corresponding text should be "turned on." The other statements specify the message content and the font ROM connections to generate the scaled 32-by-64 characters. The other three segments are similar. A separate multiplexing circuit examines various on signals and routes one set of addresses to the font ROM.

The text subsystem receives the score and the number of remaining balls via the ball, dig0, and dig1 ports. It outputs the rgb information via the rgb\_text port and outputs the on status information via the 4-bit text\_on port, which is the concatenation of four individual on signals. The complete code is shown in Listing 13.6.

Listing 13.6 Text subsystem for the pong game

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity pong_text is
    port (
       clk, reset: in std_logic;
       pixel_x, pixel_y: in std_logic_vector(9 downto 0);
       dig0, dig1: in std_logic_vector(3 downto 0);
       ball: in std_logic_vector(1 downto 0);
10
       text_on: out std_logic_vector(3 downto 0);
       text_rgb: out std_logic_vector(2 downto 0)
    );
 end pong_text;
15 architecture arch of pong_text is
    signal pix_x, pix_y: unsigned(9 downto 0);
    signal rom_addr: std_logic_vector(10 downto 0);
    signal char_addr, char_addr_s, char_addr_l, char_addr_r,
            char_addr_o: std_logic_vector(6 downto 0);
    signal row_addr, row_addr_s, row_addr_l,row_addr_r,
20
            row_addr_o: std_logic_vector(3 downto 0);
    signal bit_addr, bit_addr_s, bit_addr_l,bit_addr_r,
           bit_addr_o: std_logic_vector(2 downto 0);
```

```
signal font_word: std_logic_vector(7 downto 0);
     signal font_bit: std_logic;
25
     signal score_on, logo_on, rule_on, over_on: std_logic;
     signal rule_rom_addr: unsigned(5 downto 0);
     type rule_rom_type is array (0 to 63) of
         std_logic_vector (6 downto 0);
    -- rule text ROM definition
     constant RULE_ROM: rule_rom_type :=
        -- row 1
        "1010010", --- R
        "1010101", — \it U
35
        "1001100", --L
        "1000101", — \it E
        "0111010", — :
        "0000000", ---
        "0000000", ---
        "0000000", ---
        "0000000", --
        "0000000", ---
        "0000000", --
        "0000000", ---
45
        "0000000", --
        "0000000", ---
        "0000000", ---
        "0000000", --
        -- row 2
50
        "1010101", -- U
        "1110011", — s
        "1100101", — e
        "0000000", ---
        "1110100", — t
55
        "1110111", — w
        "1101111", --- o
        "0000000", ---
        "1100010", -- b
        "1110101", — u
60
        "1110100", — t
        "1110100", — t
        "1101111", — \it o
        "1101110", — n
        "1110011", -- s
        "0000000", ---
        -- row 3
        "1110100", -- t
        "1101111", -- 0
        "0000000", ---
70
        "1101101", —— m
        "1101111", — \it o
        "1110110", — \nu
        "1100101", -- e
        "0000000", —
75
        "1110000", — p
```

```
"1100001", -- a
       "1100100", -- d
       "1100100", -- d
       "1101100", -- l
80
        "1100101", -- e
        "0000000", --
        "0000000", --
       -- row 4
       "1110101", — u
85
        "1110000", -- p
        "0000000", --
        "1100001", — a
        "1101110", -- n
        "1100100", --d
90
        "0000000", ---
        "1100100", --d
        "1101111", --- o
        "1110111", -- w
        "1101110", — n
        "0101110", --- .
        "0000000", --
        "0000000", --
        "0000000", ---
        "0000000" ---
100
     );
  begin
     pix_x <= unsigned(pixel_x);
     pix_y <= unsigned(pixel_y);</pre>
     -- instantiate font ROM
     font_unit: entity work.font_rom
        port map(clk=>clk, addr=>rom_addr, data=>font_word);
     -- score region
110
     — - display score and ball at top left
     -- - text: "Score:DD Ball:D"
        - scale to 16-by-32 font
     score_on <=
115
        '1' when pix_y(9 downto 5)=0 and
                 pix_x(9 downto 4)<16 else
         ,0,:
     row_addr_s <= std_logic_vector(pix_y(4 downto 1));
     bit_addr_s <= std_logic_vector(pix_x(3 downto 1));
120
     with pix_x(7 downto 4) select
       char_addr_s <=
           "1010011" when "0000", — S x53
           "1100011" when "0001", -- c x63
           "1101111" when "0010", — o x6f
           "1110010" when "0011", — r x72
           "1100101" when "0100", — e x65
           "0111010" when "0101", -- : x3a
           "011" & dig1 when "0110", -- digit 10
```

```
"011" & dig0 when "0111", — digit 1
130
           "0000000" when "1000",
           "0000000" when "1001",
           "1000010" when "1010", — B x42
           "1100001" when "1011", — a x61
           "1101100" when "1100", — l x6c
135
           "1101100" when "1101", — l x6c
           "0111010" when "1110", --- :
           "01100" & ball when others;
140
     -- logo region:
           - display logo "PONG" at top center
           - used as background
           - scale to 64-by-128 font
145
     logo_on <=
         '1' when pix_y(9 downto 7)=2 and
            (3 \le pix_x(9 downto 6) and pix_x(9 downto 6) \le 6) else
     row_addr_l <= std_logic_vector(pix_y(6 downto 3));</pre>
150
     bit_addr_1 <= std_logic_vector(pix_x(5 downto 3));</pre>
     with pix_x(8 downto 6) select
       char_addr_1 <=
           "1010000" when "011", --P \times 50
           "1001111" when "100", -- O x4f
155
           "1001110" when "101", — N \times 4e
           "1000111" when others; —G x47
     -- rule region
           - display rule at center
160
           - 4 lines, 16 characters each line
           - rule text:
     __
                Rule:
                Use two buttons
                to move paddle
165
                up and down
     rule_on <= '1' when pix_x(9 downto 7) = "010" and
                           pix_y(9 downto 6)= "0010" else
                  ,0,:
170
     row_addr_r <= std_logic_vector(pix_y(3 downto 0));</pre>
     bit_addr_r <= std_logic_vector(pix_x(2 downto 0));</pre>
     rule_rom_addr <= pix_y(5 downto 4) & pix_x(6 downto 3);</pre>
     char_addr_r <= RULE_ROM(to_integer(rule_rom_addr));</pre>
     -- game over region
        - display "Game Over" at center
        - scale to 32-by-64 fonts
     over_on <=
180
         '1' when pix_y(9 downto 6)=3 and
            5 \le pix_x(9 \text{ downto } 5) \text{ and } pix_x(9 \text{ downto } 5) \le 13 \text{ else}
```

```
00:
     row_addr_o <= std_logic_vector(pix_y(5 downto 2));</pre>
185
     bit_addr_o <= std_logic_vector(pix_x(4 downto 2));
     with pix_x(8 downto 5) select
       char_addr_o <=
          "1000111" when "0101", --G x47
          "1100001" when "0110", — a x61
          "1101101" when "0111", — m \times 6d
          "1100101" when "1000", --e x65
          "0000000" when "1001", ---
          "1001111" when "1010", — O x4f
          "1110110" when "1011", — v \times 76
          "1100101" when "1100", — e \times 65
195
          "1110010" when others; — r \times 72
     -- mux for font ROM addresses and rgb
     process(score_on,logo_on,rule_on,pix_x,pix_y,font_bit,
200
              char_addr_s, char_addr_l, char_addr_r, char_addr_o,
              row_addr_s,row_addr_l,row_addr_r,row_addr_o,
              bit_addr_s, bit_addr_l, bit_addr_r, bit_addr_o)
     begin
        text_rgb <= "110"; -- yellow background
205
        if score_on='1' then
            char_addr <= char_addr_s;</pre>
            row_addr <= row_addr_s;
            bit_addr <= bit_addr_s;</pre>
            if font_bit='1' then
210
               text_rgb <= "001";
            end if;
         elsif rule_on='1' then
            char_addr <= char_addr_r;
            row_addr <= row_addr_r;
215
            bit_addr <= bit_addr_r;
            if font_bit='1' then
               text_rgb <= "001";
            end if;
         elsif logo_on='1' then
220
            char_addr <= char_addr_1;
            row_addr <= row_addr_1;
            bit_addr <= bit_addr_1;
            if font_bit='1' then
               text_rgb <= "011";
225
            end if;
         else -- game over
            char_addr <= char_addr_o;</pre>
            row_addr <= row_addr_o;
            bit_addr <= bit_addr_o;
230
            if font_bit='1' then
               text_rgb <= "001";
            end if:
        end if;
     end process;
235
```

```
text_on <= score_on & logo_on & rule_on & over_on;

--- font ROM interface

rom_addr <= char_addr & row_addr;
font_bit <= font_word(to_integer(unsigned(not bit_addr)));
end arch;
```

The structure of each segment is similar. Because the messages are short, they are coded with the regular ROM template. Since no clock signal is used, a distributed RAM or combinational logic should be inferred. Generation of the two-digit score depends on the two 4-bit external signals, dig0 and dig1. Note that the ASCII codes for the digits  $0, 1, \ldots, 9$ , are  $30_{16}, 31_{16}, \ldots, 39_{16}$ . We can generate the char\_addr signal simply by concatenating "011" in front of dig0 and dig1.

## 13.4.2 Modified graphic subsystem

To accommodate the new top-level controller, the graphic circuit in Section 12.4.3 requires several modifications:

- Add a gra\_still (for "still graphics") control signal. When it is asserted, the vertical bar is placed in the middle and the ball is placed at the center of the screen without movement.
- Add the hit and miss status signals. The hit signal is asserted for one clock cycle when the paddle hits the ball. The miss signal is asserted when the paddle misses the ball and the ball reaches the right border.
- Add a graph\_on signal to indicate the on status of the graph subsystem.

The modified portion of the code is shown in Listing 13.7.

Listing 13.7 Modified portion of a graph subsystem for the pong game

```
-- new ball position
    ball_x_next <=
       to_unsigned((MAX_X)/2,10) when gra_still='1' else
5
       ball_x_reg + ball_vx_reg when refr_tick='1' else
       ball_x_reg ;
    ball_y_next <=
       to_unsigned((MAX_Y)/2,10) when gra_still='1' else
       ball_y_reg + ball_vy_reg when refr_tick='1' else
       ball_y_reg ;
10
    -- new ball velocity
    process(ball_vx_reg, ball_vy_reg, ball_y_t, ball_x_l, ball_x_r,
             ball_y_t,ball_y_b,bar_y_t,bar_y_b,gra_still)
       hit <='0';
15
       miss <= '0':
       ball_vx_next <= ball_vx_reg;
        ball_vy_next <= ball_vy_reg;
        if gra_still='1' then
                                          -- initial velocity
           ball_vx_next <= BALL_V_N;
20
           ball_vy_next <= BALL_V_P;</pre>
        elsif ball_y_t < 1 then
                                          -- reach top
```

```
ball_vy_next <= BALL_V_P;</pre>
        elsif ball_y_b > (MAX_Y-1) then -- reach bottom
           ball_vy_next <= BALL_V_N;
25
        elsif ball_x_1 <= WALL_X_R then -- reach wall
           ball_vx_next <= BALL_V_P; --- bounce back</pre>
        elsif (BAR_X_L \le ball_x_r) and (ball_x_r \le BAR_X_R) and
              (bar_y_t \le ball_y_b) and (ball_y_t \le bar_y_b) then
              -- reach x of right bar, a hit
30
              ball_vx_next <= BALL_V_N; -- bounce back
              hit <= '1';
        elsif (ball_x_r>MAX_X) then — reach right border
           miss <= '1';
                                         -- a miss
       end if;
35
    end process;
    graph_on <= wall_on or bar_on or rd_ball_on;
```

### 13.4.3 Auxiliary counters

The top-level design requires two small utility modules, m100\_counter and timer, to facilitate the counting. The m100\_counter module is a two-digit decade counter that counts from 00 to 99 and is used to keep track of the scores of the game. Two control signals, d\_inc and d\_clr, increment and clear the counter, respectively. The code is shown in Listing 13.8.

Listing 13.8 Two-digit decade counter

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity m100_counter is
     port (
        clk, reset: in std_logic;
        d_inc, d_clr: in std_logic;
        dig0,dig1: out std_logic_vector (3 downto 0)
     );
10 end m100_counter;
  architecture arch of m100_counter is
     signal dig0_reg, dig1_reg: unsigned(3 downto 0);
     signal dig0_next, dig1_next: unsigned(3 downto 0);
15 begin
     -- registers
     process (clk,reset)
     begin
        if reset='1' then
           dig1_reg <= (others=>'0');
           dig0_reg <= (others=>'0');
        elsif (clk'event and clk='1') then
           dig1_reg <= dig1_next;</pre>
           dig0_reg <= dig0_next;</pre>
        end if;
25
```

```
end process;
    -- next-state logic for the decimal counter
    process(d_clr,d_inc,dig1_reg,dig0_reg)
    begin
        dig0_next <= dig0_reg;</pre>
30
        dig1_next <= dig1_reg;
        if (d_clr='1') then
           dig0_next <= (others=>'0');
           dig1_next <= (others=>'0');
        elsif (d_inc='1') then
35
           if dig0_reg=9 then
              dig0_next <= (others=>'0');
              if dig1_reg=9 then -- 10th digit
                  dig1_next <= (others=>'0');
              else
40
                  dig1_next <= dig1_reg + 1;
              end if;
           else -- dig0 not 9
              digO_next <= digO_reg + 1;
           end if;
        end if;
     end process;
     dig0 <= std_logic_vector(dig0_reg);</pre>
     dig1 <= std_logic_vector(dig1_reg);</pre>
50 end arch:
```

The timer module uses the 60-Hz tick, timer\_tick, to generate a 2-second interval. Its purpose is to pause the video for a small interval between transitions of the screens. It starts counting when the timer\_start signal is asserted and activates the timer\_up signal when the 2-second interval is up. The code is shown in Listing 13.9.

**Listing 13.9** Two-second timer

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity timer is
    port (
       clk, reset: in std_logic;
       timer_start, timer_tick: in std_logic;
       timer_up: out std_logic
    );
10 end timer;
 architecture arch of timer is
    signal timer_reg, timer_next: unsigned(6 downto 0);
 begin
      - registers
    process (clk, reset)
    begin
       if reset='1' then
           timer_reg <= (others=>'1');
        elsif (clk'event and clk='1') then
           timer_reg <= timer_next;
```

```
end if;
     end process;
    -- next-state logic
     process(timer_start,timer_reg,timer_tick)
25
     begin
        if (timer_start='1') then
           timer_next <= (others=>'1');
        elsif timer_tick='1' and timer_reg/=0 then
           timer_next <= timer_reg - 1;</pre>
30
           timer_next <= timer_reg;</pre>
        end if;
     end process;
    -- output logic
     timer_up <='1' when timer_reg=0 else '0';</pre>
 end arch;
```

### 13.4.4 Top-level system

The top-level system of the pong game consists of the previously designed modules, including video synchronization circuit, graphic subsystem, text subsystem, and utility counters, as well as a control FSM and an rgb multiplexing circuit. The block diagram is shown in Figure 13.4.

The control FSM monitors overall system operation and coordinates the activities of the text and graphic subsystems. Its ASMD chart is shown in Figure 13.6. The FSM has four states and operates as follows:

- Initially, the FSM is in the newgame state. The game starts when a button is pressed and the FSM moves to the play state.
- In the play state, the FSM checks the hit and miss signals continuously. When the hit signal is activated, the d\_inc signal is asserted for one clock cycle to increment the score counter. When the miss signal is asserted, the FSM activates the 2-second timer, decrements the number of the balls by 1, and examines the number of remaining balls. If it is zero, the game is ended and the FSM moves to the over state. Otherwise, the FSM moves to the newball state.
- The FSM waits in the newball state until the 2-second interval is up (i.e., when the timer\_up signal is asserted) and a button is pressed. It then moves to the play state to continue the game.
- The FSM stays in the over state until the 2-second interval is up. It then moves to the newgame state for a new game.

The rgb multiplexing circuit routes the text\_rgb or graph\_rgb signals to output according to the text\_on and graphic\_on signals. The key segment is

```
if (text_on(3)='1') or
   (state_reg=newgame and text_on(1)='1') or
   (state_reg=over and text_on(0)='1') then
   rgb_next <= text_rgb;
elsif graph_on='1' then -- display graph
   rgb_next <= graph_rgb;
elsif text_on(2)='1' then -- display logo
   rgb_next <= text_rgb;</pre>
```



Figure 13.6 ASMD chart of the pong controller.

```
else
  rgb_next <= "110"; -- yellow background
end if:
```

The text\_on(3)='1' expression is the condition for the scores, which is always displayed. The text\_on(1)='1' expression is the condition for the rule, which is displayed only when the FSM is in the newgame state. Similarly, the end-of-game message, whose status is indicated by the text\_on(0) signal, is displayed only when the FSM is in the over state. The logo, whose status is indicated by the text\_on(2) signal, is used as part of the background and is displayed only when no other on signal is asserted.

The complete code is shown in Listing 13.10.

Listing 13.10 Top-level system for the pong game

```
library ieee;
 use ieee.std_logic_1164.all;
 use ieee.numeric_std.all;
 entity pong_top is
    port (
       clk, reset: in std_logic;
       btn: in std_logic_vector (1 downto 0);
       hsync, vsync: out std_logic;
       rgb: out std_logic_vector (2 downto 0)
    );
 end pong_top;
 architecture arch of pong_top is
    type state_type is (newgame, play, newball, over);
    signal video_on, pixel_tick: std_logic;
    signal pixel_x, pixel_y: std_logic_vector (9 downto 0);
    signal graph_on, gra_still, hit, miss: std_logic;
    signal text_on: std_logic_vector(3 downto 0);
    signal graph_rgb , text_rgb: std_logic_vector(2 downto 0);
    signal rgb_reg, rgb_next: std_logic_vector(2 downto 0);
    signal state_reg, state_next: state_type;
    signal dig0, dig1: std_logic_vector(3 downto 0);
    signal d_inc, d_clr: std_logic;
    signal timer_tick, timer_start, timer_up: std_logic;
    signal ball_reg, ball_next: unsigned(1 downto 0);
    signal ball: std_logic_vector(1 downto 0);
 begin
    -- instantiate video synchronization unit
    vga_sync_unit: entity work.vga_sync
       port map(clk=>clk, reset=>reset,
                 hsync=>hsync, vsync=>vsync,
                pixel_x=>pixel_x, pixel_y=>pixel_y,
                video_on=>video_on, p_tick=>pixel_tick);
    -- instantiate text module
    ball <= std_logic_vector(ball_reg); --type conversion</pre>
35
    text_unit: entity work.pong_text
       port map(clk=>clk, reset=>reset,
                 pixel_x => pixel_x, pixel_y => pixel_y,
                 dig0=>dig0, dig1=>dig1, ball=>ball,
                 text_on=>text_on, text_rgb=>text_rgb);
```

```
-- instantiate graph module
    graph_unit: entity work.pong_graph
        port map(clk=>clk, reset=>reset, btn=>btn,
                pixel_x=>pixel_x, pixel_y=>pixel_y,
45
                gra_still=>gra_still, hit=>hit, miss=>miss,
                graph_on=>graph_on,rgb=>graph_rgb);
    -- instantiate 2-sec timer
    timer_tick \leftarrow -- 60-Hz tick
        '1' when pixel_x="000000000" and
                 pixel_y="0000000000" else
50
        ,0;;
    timer_unit: entity work.timer
        port map(clk=>clk, reset=>reset,
                 timer_tick=>timer_tick,
                 timer_start=>timer_start,
55
                 timer_up=>timer_up);
    -- instantiate 2-digit decade counter
    counter_unit: entity work.m100_counter
        port map(clk=>clk, reset=>reset,
                 d_inc=>d_inc, d_clr=>d_clr,
60
                 dig0=>dig0, dig1=>dig1);
    -- registers
     process (clk, reset)
     begin
        if reset='1' then
           state_reg <= newgame;
           ball_reg <= (others=>'0');
           rgb_reg <= (others=>'0');
        elsif (clk'event and clk='1') then
           state_reg <= state_next;
70
           ball_reg <= ball_next;
           if (pixel_tick='1') then
             rgb_reg <= rgb_next;
           end if:
        end if;
75
    end process;
    -- fsmd next-state logic
     process(btn,hit,miss,timer_up,state_reg,
             ball_reg,ball_next)
     begin
        gra_still <= '1';</pre>
        timer_start <='0';</pre>
        d_inc <= '0';
        d_clr <= '0';
        state_next <= state_reg;
85
        ball_next <= ball_reg;
        case state_reg is
           when newgame =>
              ball_next <= "11";</pre>
                                     -- three balls
              d_clr <= '1';
                                     -- clear score
90
              if (btn /= "00") then -- button pressed
                 state_next <= play;
                 ball_next <= ball_reg - 1;
```

```
end if:
           when play =>
              gra_still <= '0';</pre>
                                   -- animated screen
               if hit='1' then
                  d_inc <= '1';
                                    -- increment score
               elsif miss='1' then
                  if (ball_reg=0) then
100
                     state_next <= over;
                  else
                     state_next <= newball;
                  end if;
                  timer_start <= '1'; --- 2-sec timer
                  ball_next <= ball_reg - 1;
              end if;
           when newball =>
              -- wait for 2 sec and until button pressed
              if timer_up='1' and (btn /= "00") then
110
                state_next <= play;
              end if;
           when over =>
              -- wait for 2 sec to display game over
              if timer_up='1' then
115
                   state_next <= newgame;
              end if;
         end case:
     end process;
     -- rgb multiplexing circuit
120
     process(state_reg, video_on, graph_on, graph_rgb,
             text_on, text_rgb)
     begin
        if video_on='0' then
           rgb_next <= "000"; -- blank the edge/retrace
125
        else
            – display score, rule or game over
           if (text_on(3)='1') or
              (state_reg=newgame and text_on(1)='1') or -- rule
              (state_reg=over and text_on(0)='1') then
130
              rgb_next <= text_rgb;
           elsif graph_on='1' then -- display graph
             rgb_next <= graph_rgb;
           elsif text_on(2)='1' then -- display logo
             rgb_next <= text_rgb;
135
           else
             rgb_next <= "110"; -- yellow background
           end if;
        end if;
     end process;
     rgb <= rgb_reg;
 end arch;
```

#### 13.5 BIBLIOGRAPHIC NOTES

Several other character fonts are available. *Rapid Prototyping of Digital Systems* by James O. Hamblen et al. uses a compact 64-character 8-by-8 font set. The tile-mapped scheme is not limited to the text display. It is widely used in the early video game. The article "Computer Graphics During the 8-bit Computer Game Era" by Steven Collins (*ACM SIG-GRAPH*, May 1998) provides a comprehensive review of the history and design techniques of the tile-based game.

#### 13.6 SUGGESTED EXPERIMENTS

### 13.6.1 Rotating banner

A rotating banner on the monitor screen moves a line from right to left and then wraps around. It is similar to the Window's Marquee screen saver. Let the text on the banner be "Hello, FPGA World." The banner should be displayed in four different font sizes and can travel at four different speeds. The font size and speed are controlled by four switches. Derive the HDL description and then synthesize and verify operation of the circuit.

#### 13.6.2 Underline for the cursor

The full-screen text display circuit in Section 13.3 uses reversed color to indicate the current cursor location. Modify the design to use an underline to indicate the cursor location. Derive the HDL description and then synthesize and verify operation of the circuit.

## 13.6.3 Dual-mode text display

It is sometimes better for text to be displayed on a "vertical" screen. This can be done by turning the monitor 90 degrees and resting it on its side. Design this circuit as follows:

- 1. Modify the full-screen text display circuit in Section 13.3 for a vertical screen.
- 2. Merge the normal and vertical designs to create a "dual-mode" text display. Use a switch to select the desired mode.
- 3. Derive the HDL description and then synthesize and verify operation of the circuit.

### 13.6.4 Keyboard text entry

Instead of switches and buttons, it is more natural to use a keyboard to enter text. We can use the four arrow keys to move the cursor and use the regular keys to enter the characters. Use the keyboard interface discussed in Section 8.4 to design the new circuit. Derive the HDL description and then synthesize and verify operation of the circuit.

#### 13.6.5 UART terminal

The UART terminal receives input from the UART port and displays the received characters on a monitor. When connected to the PC's serial port, it should echo the text on Window's HypterTerminal. The detailed specifications are:

- A cursor is used to indicate the current location.
- The screen starts a new line when a "carriage return" code (0d<sub>16</sub>) is received.



Figure 13.7 Tile patterns and encoding of square wave.

- A line wraps around (i.e., starts a new line) after 80 characters.
- When the cursor reaches the bottom of the screen (i.e., the last line), the first line will be discarded and all other lines move up (i.e., scroll up) one position.

Derive the HDL description and then synthesize and verify operation of the circuit.

# 13.6.6 Square wave display

We can draw a square wave by using four simple tile patterns shown in Figure 13.7(a). Follow the procedure of a full-screen text display in Section 13.3 to design a full-screen wave editor:

- 1. Let the tile size be 8 columns by 64 rows. Create a pattern ROM for the four patterns.
- 2. Calculate the number of tiles on a 640-by-480 resolution screen and derive the proper configuration for the tile memory.
- 3. Use three pushbuttons for control and a 2-bit switch to enter the pattern.
- 4. Derive the HDL description and then synthesize and verify operation of the circuit.

## 13.6.7 Simple four-trace logic analyzer

A logic analyzer displays the waveforms of a collection of digital signals. We want to design a simple logic analyzer that captures the waveforms of four input signals in "free-running" mode. Instead of using a trigger pattern, data capture is initiated with activation of a pushbutton switch. For simplicity, we assume that the frequencies of the input waveform are between 10 kHz and 100 kHz. The circuit can be designed as follows:

- 1. Use a sampling tick to sample the four input signals. Make sure to select a proper rate so that the desired input frequency range can be displayed properly on the screen.
- 2. For a point in the sampled signal, its value can be encoded as a tile pattern by including the value of the previous point. For example, if the sampled sequence of one signal is "00001111000", the tile patterns become "00 00 00 01 11 11 11 10 00 00", as shown in Figure 13.7(b).
- 3. Follow the procedure of the preceding square wave experiment to design the tile memory and video interface to display the four waveforms being stored.
- 4. Derive the HDL description and then synthesize the circuit.

To verify operation of the circuit, we can connect four external signals via headers around the prototyping board. Alternatively, we can create a top-level test module that includes a 4-bit counter (say, a mod-10 counter around 50 kHz) and the logic analyzer, resynthesize the circuit, and verify its operation.

## 13.6.8 Complete two-player pong game

The free-running two-player pong game is described in Experiment 12.7.6. Follow the procedure of the pong game in Section 13.4 to derive the complete system. This should include the design of a new text display subsystem and the design of a top-level FSM controller. Derive the HDL description and then synthesize and verify operation of the circuit.

# 13.6.9 Complete breakout game

The free-running breakout game is described in Experiment 12.7.7. Follow the procedure of the pong game in Section 13.4 to derive the complete system. This should include the design of a new text display subsystem and the design of a top-level FSM controller. Derive the HDL description and then synthesize and verify operation of the circuit.